A key ingredient in our proof is the isotropic local semicircle law, which establishes optimal high-probability
bounds on the quantity $\langle \mathbf{v}, [(H - z)^{-1} - m(z) \mathbb{1}] \mathbf{w} \rangle$, where $m(z)$ is the Stieltjes transform
of Wigner's semicircle law and $\mathbf{v}, \mathbf{w}$ are arbitrary deterministic vectors.

1. Introduction



Random matrices were introduced by Wigner [35] in the 1950s to model the excitation spectra of large atomic 
nuclei, and have since been the subject of intense mathematical investigation. In this paper we study Wigner 
matrices - random matrices whose entries are independent up to symmetry constraints - that have been 
deformed by a finite-rank perturbation. By Weyl's eigenvalue interlacing inequalities, such a deformation does 
not influence the global statistics of the eigenvalues. Thus, the empirical eigenvalue densities of deformed 
and undeformed Wigner matrices have the same large-scale asymptotics, and are governed by Wigner's 
famous semicircle law. However, the behaviour of individual eigenvalues may change dramatically under a 
deformation. In particular, deformed Wigner matrices may exhibit outliers, eigenvalues located away from 
the bulk spectrum. Such models were first investigated by Füredi and Komlós [29]. Subsequently, much
progress [5-7, 11-13,28,32] has been made in the analysis of the spectrum of such deformed matrix models. 
See e.g. [32] for a review of recent developments. Analogous deformations of covariance matrices, so-called 
spiked population models, as well as generalizations thereof, were studied in [1,2,4]. 

In a seminal work [3], Baik, Ben Arous, and Péché investigated the spectrum of deformed (spiked)
complex Gaussian sample covariance matrices. They established a phase transition, sometimes referred to
as the BBP transition, in the distribution of the extremal eigenvalues. In [31], Péché proved a similar result
for additive deformations of GUE (the Gaussian Unitary Ensemble). Subsequently, the results of [3] and [31]
were extended to the other Gaussian ensembles, such as GOE (the Gaussian Orthogonal Ensemble), by
Bloemendal and Virág [9,10]. We sketch the results of [3,9,10,31] in the case of additive deformations of
GUE. For simplicity, we consider rank-one deformations, although the results of [3,9,10,31] cover arbitrary
rank-$k$ deformations. Thus, let $H$ be an $N \times N$ GUE matrix, normalized so that its entries have variance
$N^{-1}$. Let $H(d) := H + d \mathbf{v} \mathbf{v}^*$, where $\mathbf{v}$ is a normalized vector and $d$ is independent of $N$. If $d > 1$ then the
spectrum of $H(d)$ consists of a bulk spectrum asymptotically contained in $[-2, 2]$, and an outlier, located at
$d + d^{-1}$ and having a normal law with variance of order $N^{-1}$. If $d < 1$ then there is no such outlier, and
the statistics of the extremal eigenvalues of $H(d)$ coincide with those of $H$. Thus, as $d$ increases from $1 - \varepsilon$
to $1 + \varepsilon$ for some small $\varepsilon > 0$, the largest eigenvalue of $H(d)$ detaches itself from the bulk spectrum and
becomes an outlier.

The phase transition takes place on the scale $d = 1 + w N^{-1/3}$ where $w$ is of order one. This may be
heuristically understood as follows. The largest eigenvalues of $H$ are known to fluctuate on the scale $N^{-2/3}$
around 2. The critical scale for $d$, i.e. the scale on which the outlier is separated from 2 by a gap of order
$N^{-2/3}$, is therefore $d = 1 + w N^{-1/3}$ (since in that case $d + d^{-1} = 2 + w^2 N^{-2/3} + O(w^3 N^{-1})$). In [3,9,10,31],
the authors established the weak convergence as $N \to \infty$

    \[ N^{2/3} \bigl( \lambda_N(H(1 + w N^{-1/3})) - 2 \bigr) \Rightarrow \Lambda_w, \]

where $\lambda_N(A)$ denotes the largest eigenvalue of $A$. Moreover, the asymptotics in $w$ of the law $\Lambda_w$ was analysed
in [3,8-10,31]: as $w \to +\infty$, the law $\Lambda_w$ converges to a Gaussian; as $w \to -\infty$, the law $\Lambda_w$ converges to the
Tracy-Widom-$\beta$ distribution (where $\beta = 1$ for GOE and $\beta = 2$ for GUE). As mentioned above, the results
of [3,9,10,31] also apply to rank-$k$ deformations, where the picture is similar: each eigenvalue $d_i \in [-1, 1]^c$
gives rise to an outlier located around $d_i + d_i^{-1}$, while eigenvalues $d_i \in (-1, 1)$ do not change the statistics
of the extremal eigenvalues of $H$.
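
The heuristic for the critical scale can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original argument; the function names are hypothetical. It uses the exact identity $d + d^{-1} - 2 = (d-1)^2/d$, so that for $d = 1 + w N^{-1/3}$ with $w > 0$ the gap differs from $w^2 N^{-2/3}$ by at most $w^3 N^{-1}$.

```python
def outlier_location(d):
    """Classical outlier location d + 1/d (meaningful for |d| > 1)."""
    return d + 1.0 / d


def gap_from_edge(w, N):
    """Distance of the outlier from the spectral edge 2 when d = 1 + w * N^(-1/3)."""
    d = 1.0 + w * N ** (-1.0 / 3.0)
    return outlier_location(d) - 2.0


if __name__ == "__main__":
    w = 2.0  # illustrative value of order one (an assumption of this sketch)
    for N in (10**3, 10**6, 10**9):
        gap = gap_from_edge(w, N)
        leading = w**2 * N ** (-2.0 / 3.0)
        # For w > 0 the remainder |gap - leading| is bounded by w^3 / N.
        print(N, gap, leading, abs(gap - leading) <= w**3 / N)
```

The gap is of the same order $N^{-2/3}$ as the edge fluctuations of $H$ exactly when $w$ is of order one.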

The proofs of [3,31] use an asymptotic analysis of Fredholm determinants, while those of [9,10] use an
explicit tridiagonal representation of $H$; both of these approaches rely heavily on the Gaussian nature of
$H$. In order to study the phase transition for non-Gaussian matrix ensembles, and in particular address the
question of spectral universality, a different approach is needed. Interestingly, it was observed in [11-13] that
the distribution of the outliers is not universal, and may depend on the geometry of the eigenvectors of $A$.
The non-universality of the outliers was further investigated in [32].

In the present paper we take $H$ to be a real symmetric or complex Hermitian Wigner matrix, and $A$ to
be a rank-$k$ deterministic matrix whose symmetry class (real symmetric or complex Hermitian) coincides
with that of $H$. We make the following assumptions on the perturbation $A$.

(A1) The eigenvalues $d_1, \dots, d_k$ of $A$ may depend on $N$; they satisfy $\bigl| |d_i| - 1 \bigr| \geq (\log N)^{C \log \log N} N^{-1/3}$, i.e.,
on the scale of the phase transition, the eigenvalues of $A$ are separated from the transition points by
at least a logarithmic factor.

(A2) The eigenvectors of A are arbitrary orthonormal vectors. 

Our main results on the spectrum of H + A may be informally summarized as follows. 

(R1) The non-outliers "stick" to eigenvalues of the undeformed matrix $H$ (Theorem 2.7). In particular, the
extremal bulk eigenvalues of H + A are universal. 

(R2) We identify the distribution of the outliers of H + A (Theorem 2.14). 

A key ingredient in our proof is a generalization of the local semicircle law. The study of the local
semicircle law was initiated in [21,22]; it provides a key step towards establishing universality for Wigner
matrices [17,23,26,27,33,34]. The strongest versions of the local semicircle law, proved in [15,16,26], give
precise estimates on the local eigenvalue density, down to scales containing $N^\varepsilon$ eigenvalues. In fact, as
formulated in [26], the local semicircle law gives optimal high-probability estimates on the quantity

    \[ G_{ij}(z) - \delta_{ij} m(z), \tag{1.1} \]

where $m(z)$ denotes the Stieltjes transform of Wigner's semicircle law and $G(z) = (H - z)^{-1}$ is the resolvent
of $H$. Starting from such estimates on (1.1), the two following facts are established in [26].

(i) The eigenvalue density is governed by Wigner's semicircle law down to scales containing $N^\varepsilon$ eigenvalues.

(ii) Eigenvalue rigidity: optimal high-probability bounds on the eigenvalue locations.

Another key ingredient in the proof of universality of random matrices is the Green function comparison
method introduced in [27]. It uses a Lindeberg replacement strategy, which previously appeared in the context
of random matrix theory in [14,33,34]. A fundamental input in the Green function comparison method is a
precise control on the matrix entries of $G$, which is provided by the local semicircle law. The Green function
comparison method has subsequently been applied to proving the spectral universality of adjacency matrices
of random graphs [15,16] as well as the universality of eigenvectors of Wigner matrices [30].

In this paper, we extend the local semicircle law to the isotropic local semicircle law, which gives optimal
high-probability estimates on the quantity

    \[ \langle \mathbf{v}, (G(z) - m(z) \mathbb{1}) \mathbf{w} \rangle, \tag{1.2} \]

where $\mathbf{v}$ and $\mathbf{w}$ are arbitrary deterministic vectors. Note that (1.1) is a special case obtained from (1.2) by
setting $\mathbf{v} = \mathbf{e}_i$ and $\mathbf{w} = \mathbf{e}_j$, where $\mathbf{e}_i$ denotes the $i$-th standard basis vector of $\mathbb{C}^N$.

1.1. Outline and sketch of proofs. In Section 2, we introduce basic definitions and state our results. In a
first part, we state the isotropic semicircle law (Theorem 2.2) and some important corollaries, such as the
isotropic delocalization estimate (Theorem 2.5). The second part of Section 2 is devoted to the spectra of
deformed Wigner matrices. Our main results are deviation estimates on the eigenvalue locations (Theorem 
2.7) and the distribution of the outliers (Theorem 2.14). In subsequent remarks we discuss some special 
cases of interest, in particular making the link to the previous results of [11-13,32]. 

The remainder of this paper is devoted to proofs. As it turns out, the proof of the isotropic local semicircle 
law is considerably simpler if the third moments of the matrix entries of H vanish. This case is dealt with in 
Section 3. The proof is based on the Green function comparison method and the local semicircle law of [26]. 
In Section 4, we give the additional arguments needed to extend the isotropic local semicircle law to arbitrary 
matrix entries. We remark that the Green function comparison method has been traditionally [16,27,30] 
used to obtain limiting distributions of smooth, bounded, observables that depend on the resolvent G. In 
this paper we use it in a novel setting: to obtain high-probability bounds on a fluctuating error. 

In Section 5 we use the isotropic semicircle law to obtain an improved estimate outside of the classical 
spectrum $[-2, 2]$, and prove the isotropic delocalization result which yields optimal high-probability bounds
on projections of the eigenvectors of H onto arbitrary deterministic vectors. 

Section 6 is devoted to the proof of deviation estimates for the eigenvalues of H + A. Our starting 
point for locating the eigenvalues is a simple identity from linear algebra (Lemma 6.1) already used in the 
works [5-7,32]. Similar identities were also used in [1,2,4] for deformed covariance matrices. Using such 
identities, the study of the eigenvalue distribution of the deformed ensemble can be reduced to the study of 
the resolvent. In our case, this study of the resolvent is considerably more involved because we allow very 
general perturbations and also identify the distribution of non-outliers. In order to illustrate our method, we 
first consider the rank-one case in Theorem 6.3. The general rank-$k$ case is based on a bootstrap argument,
in which the eigenvalues $\mathbf{d} = (d_1, \dots, d_k)$ of $A$ are varied, and which may be summarized in the following three
steps.

(i) For arbitrary $\mathbf{d}$, we establish a "permissible region" $\Gamma(\mathbf{d}) \subset \mathbb{R}$ whose complement cannot contain
eigenvalues of $H + A$. The region $\Gamma(\mathbf{d})$ consists essentially of small neighbourhoods of the extremal
eigenvalues of $H$ as well as of small neighbourhoods of the classical outlier locations $d_i + d_i^{-1}$ for $i$
satisfying $|d_i| > 1$.

(ii) We fix $\mathbf{d}$ to be independent of $N$. In this simple case, we prove that each permissible neighbourhood
of a classical outlier location $d_i + d_i^{-1}$ contains exactly one eigenvalue of $H + A$. Moreover, we prove
that the non-outliers of $H + A$ stick to eigenvalues of $H$.

(iii) In order to allow arbitrary $N$-dependent $\mathbf{d}$'s, we construct a continuous path $(\mathbf{d}(t))_{t \in [0,1]}$ that takes an
$N$-independent initial configuration $\mathbf{d}(0)$ to the desired $N$-dependent configuration $\mathbf{d} = \mathbf{d}(1)$. Using (i),
(ii), and the continuity of the eigenvalues of $H + A(t)$ as functions of $t$, we infer that the conclusions
of (ii) remain valid for all $\mathbf{d}(t)$ where $t \in [0, 1]$, and in particular for $\mathbf{d}(1)$. (Here $A(t)$ denotes the
perturbation with eigenvalues $\mathbf{d}(t)$.)

Finally, Section 7 contains the proof of Theorem 2.14, the distribution of the outliers. The proof consists 
of four main steps. 

(i) We reduce the problem of identifying the distribution of an outlier to that of analysing the distribution
of random variables of the form $\langle \mathbf{v}, G(\theta) \mathbf{v} \rangle$, where $\theta := d + d^{-1}$ and $d$ is an eigenvalue of $A$ with
associated eigenvector $\mathbf{v}$. The argument is based on a precise control of the derivative of $G(z)$ and
second-order perturbation theory.

(ii) We consider the case where $H$ is Gaussian. Using the unitary invariance of the law of $H$, we prove
that $\langle \mathbf{v}, G(\theta) \mathbf{v} \rangle$, when appropriately rescaled, converges to a normal random variable.

The remainder of the proof consists in analysing the difference between the general Wigner case and the
Gaussian case. Ultimately, we shall apply the Green function comparison method to expressions of the form
$\langle \mathbf{v}, G(\theta) \mathbf{v} \rangle$ (Step (iv) below). However, this method is only applicable if $\|\mathbf{v}\|_\infty$ is sufficiently small (in fact,
our result shows that the Green function comparison method must fail if $\|\mathbf{v}\|_\infty$ is not small). We therefore
have to perform a two-step comparison.

(iii) Let $H$ be the Wigner matrix we are interested in. We introduce a cutoff $\varepsilon_N$ (equal to $\varphi^{-D}$ in the
notation of Section 7.3). We define $\widehat{H}$ as the Wigner matrix obtained from $H$ by replacing the $(i, j)$-th
entry of $H$ with a Gaussian whenever $|v_i| \leq \varepsilon_N$ and $|v_j| \leq \varepsilon_N$. We choose $\varepsilon_N$ large enough that most
entries of $\widehat{H}$ are Gaussian. We shall compare $H$ with a Gaussian matrix $V$ via the intermediate matrix
$\widehat{H}$. In this step, (iii), we compare $\widehat{H}$ with $V$.

Our proof relies on a block expansion of $\widehat{H}$, which expresses the distribution of the difference

    \[ \langle \mathbf{v}, (\widehat{H} - \theta)^{-1} \mathbf{v} \rangle - \langle \mathbf{v}, (V - \theta)^{-1} \mathbf{v} \rangle \]

in terms of a sum of independent random variables ($\Gamma_1, \dots, \Gamma_6$ in the notation of Section 7.3) whose
laws may be explicitly computed.

(iv) In the final step, we use the Green function comparison method to analyse the difference

    \[ \langle \mathbf{v}, (H - \theta)^{-1} \mathbf{v} \rangle - \langle \mathbf{v}, (\widehat{H} - \theta)^{-1} \mathbf{v} \rangle. \]

By definition of $\widehat{H}$, whenever the entry of $H$ differs from that of $\widehat{H}$, we have $|v_i| \leq \varepsilon_N$ and
$|v_j| \leq \varepsilon_N$. As a consequence, the Green function comparison method is applicable. Of
special note in this comparison argument is a shift in the mean of the outlier (arising from the second
term on the right-hand side of (7.50)), depending on the third moments of the entries of $H$.

Acknowledgements. We are grateful to Alex Bloemendal, Paul Bourgade, László Erdős, and Horng-Tzer
Yau for helpful comments. 



2. Results 



2.1. The setup. Let $H^\omega \equiv H = (h_{ij})$ be an $N \times N$ matrix; here $\omega$ denotes the running element in probability
space, which we shall almost always drop from the notation. We assume that the upper-triangular entries
$(h_{ij} : i \leq j)$ are independent complex-valued random variables. The remaining entries of $H$ are given by
imposing $H = H^*$. Here $H^*$ denotes the Hermitian conjugate of $H$. We assume that all entries are centred,
$\mathbb{E} h_{ij} = 0$. In addition, we assume that one of the two following conditions holds.

(i) Real symmetric Wigner matrix: $h_{ij} \in \mathbb{R}$ for all $i, j$ and

    \[ \mathbb{E} h_{ii}^2 = \frac{2}{N}, \qquad \mathbb{E} h_{ij}^2 = \frac{1}{N} \quad (i \neq j). \]

(ii) Complex Hermitian Wigner matrix:

    \[ \mathbb{E} h_{ii}^2 = \frac{1}{N}, \qquad \mathbb{E} |h_{ij}|^2 = \frac{1}{N}, \qquad \mathbb{E} h_{ij}^2 = 0 \quad (i \neq j). \]

We use the abbreviation GOE/GUE to mean GOE if $H$ is a real symmetric Wigner matrix with Gaussian
entries and GUE if $H$ is a complex Hermitian Wigner matrix with Gaussian entries. We assume that the
entries of $H$ have uniformly subexponential decay, i.e. that there exists a constant $\vartheta > 0$ such that

    \[ \mathbb{P}\bigl( \sqrt{N} |h_{ij}| \geq x \bigr) \leq \vartheta^{-1} \exp(-x^\vartheta) \tag{2.1} \]

for all $i$, $j$, and $x > 0$. Note that we do not assume the entries of $H$ to be identically distributed.

The following quantities will appear throughout this paper. We choose a fixed but arbitrary constant
$\Sigma \geq 3$. We define the logarithmic control parameter

    \[ \varphi \equiv \varphi_N := (\log N)^{\log \log N}. \tag{2.2} \]

The parameter $\zeta$ will play the role of a fixed positive constant, which simultaneously dictates the power of
$\varphi$ in large deviations estimates and characterizes the decay of probability of exceptional events, according to
the following definition.

Definition 2.1 (High probability events). Let $\zeta > 0$. We say that an $N$-dependent event $\Xi$ holds with
$\zeta$-high probability if there is some constant $C$ such that

    \[ \mathbb{P}(\Xi^c) \leq N^C \exp(-\varphi^\zeta) \tag{2.3} \]

for large enough $N$.

Introduce the spectral parameter

    \[ z = E + i\eta, \]

which will be used as the argument of Stieltjes transforms and resolvents. In the following we shall often use
the notation $E = \operatorname{Re} z$ and $\eta = \operatorname{Im} z$ without further comment. Let

    \[ \varrho(\xi) := \frac{1}{2\pi} \sqrt{[4 - \xi^2]_+} \qquad (\xi \in \mathbb{R}) \]

denote the density of the semicircle law, and

    \[ m(z) := \int \frac{\varrho(\xi)}{\xi - z} \, d\xi \qquad (z \notin [-2, 2]) \tag{2.4} \]

its Stieltjes transform. To avoid confusion, we remark that the Stieltjes transform $m$ was denoted by $m_{sc}$
in the papers [15-27], in which $m$ had a different meaning from (2.4). It is well known that the Stieltjes
transform $m$ satisfies the identity

    \[ m(z) + \frac{1}{m(z)} + z = 0. \tag{2.5} \]

For $\eta > 0$ we define the resolvent of $H$ through

    \[ G(z) := (H - z)^{-1}. \]
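
The identity (2.5) can be verified numerically. The sketch below is an illustration under stated assumptions rather than part of the paper: the function names are hypothetical, the closed form is the root of $m^2 + zm + 1 = 0$ selected by the requirement $\operatorname{Im} m > 0$ for $\operatorname{Im} z > 0$, and the integral (2.4) is approximated by a plain midpoint rule.

```python
import cmath
import math


def semicircle_density(x):
    """Density of the semicircle law, supported on [-2, 2]."""
    return math.sqrt(max(4.0 - x * x, 0.0)) / (2.0 * math.pi)


def stieltjes_closed_form(z):
    """Root of m + 1/m + z = 0 with Im m > 0; intended for Im z > 0."""
    root = cmath.sqrt(z * z - 4.0)
    m = (-z + root) / 2.0
    # The two roots multiply to 1, so exactly one has positive imaginary part.
    return m if m.imag > 0 else (-z - root) / 2.0


def stieltjes_quadrature(z, n=20000):
    """Midpoint-rule approximation of the integral of density(x) / (x - z) over [-2, 2]."""
    h = 4.0 / n
    total = 0.0 + 0.0j
    for j in range(n):
        x = -2.0 + (j + 0.5) * h
        total += semicircle_density(x) / (x - z)
    return total * h


if __name__ == "__main__":
    z = 0.3 + 0.5j  # arbitrary test point in the upper half-plane
    m = stieltjes_closed_form(z)
    print(abs(m + 1.0 / m + z))              # residual of (2.5)
    print(abs(stieltjes_quadrature(z) - m))  # quadrature versus closed form
```

At $z = 1.5i$ the closed form gives $m = 0.5i$, and indeed $0.5i + 1/(0.5i) + 1.5i = 0$.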
We use the notation $\mathbf{v} = (v_i)_{i=1}^N \in \mathbb{C}^N$ for the components of a vector. We introduce the standard
scalar product $\langle \mathbf{v}, \mathbf{w} \rangle := \sum_i \bar{v}_i w_i$, which induces the Euclidean norm $\|\mathbf{v}\| := \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}$. By definition, $\mathbf{v}$ is
normalized if $\|\mathbf{v}\| = 1$.

We denote by $C$ a generic positive large constant, whose value may change from one expression to the
next. If this constant depends on some parameters $\alpha$, we indicate this by writing $C_\alpha$. Finally, for two
positive quantities $A_N$ and $B_N$ we use the notation $A_N \asymp B_N$ to mean $C^{-1} A_N \leq B_N \leq C A_N$ for some
positive constant $C$.

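2.2. The isotropic local semicircle law. For $\zeta > 0$ let

    \[ S(\zeta) := \{ z \in \mathbb{C} : |E| \leq \Sigma, \ \varphi^\zeta N^{-1} \leq \eta \leq \Sigma \}. \tag{2.6} \]

For $z \in S(\zeta)$ define the control parameter

    \[ \Psi(z) := \sqrt{\frac{\operatorname{Im} m(z)}{N\eta}} + \frac{1}{N\eta}. \]

Our first main result is on the convergence of $G(z)$ to $m(z) \mathbb{1}$.

Theorem 2.2 (Isotropic local semicircle law). Fix $\zeta > 0$. Then there exists a constant $C_\zeta$ such that

    \[ |\langle \mathbf{v}, G(z) \mathbf{w} \rangle - m(z) \langle \mathbf{v}, \mathbf{w} \rangle| \leq \varphi^{C_\zeta} \Psi(z) \|\mathbf{v}\| \|\mathbf{w}\| \tag{2.7} \]

holds with $\zeta$-high probability for all deterministic $\mathbf{v}, \mathbf{w} \in \mathbb{C}^N$ under either of the two following conditions.

A. The spectral parameter $z \in S(C_\zeta)$ is arbitrary, and the third moments of the entries of $H$ vanish in the
sense that

    \[ \mathbb{E} h_{ij}^3 = \mathbb{E} \bigl( |h_{ij}|^2 h_{ij} \bigr) = 0 \qquad (i, j = 1, \dots, N). \tag{2.8} \]

B. The spectral parameter $z \in S(C_\zeta)$ satisfies

    \[ \Psi(z)^3 \leq \varphi^{-C_0} N^{-1/2} \tag{2.9} \]

for some large enough constant $C_0$ depending on $\zeta$.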

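Away from the asymptotic spectrum $[-2, 2]$, Theorem 2.2 can be strengthened as follows.

Theorem 2.3 (Isotropic local semicircle law outside of the spectrum). Fix $\zeta > 0$ and $\Sigma \geq 3$.
Then there exist constants $C_1$ and $C_0$ such that for any

    \[ E \in [-\Sigma, -2 - \varphi^{C_1} N^{-2/3}] \cup [2 + \varphi^{C_1} N^{-2/3}, \Sigma], \]

any $\eta \in (0, \Sigma]$, and any deterministic $\mathbf{v}, \mathbf{w} \in \mathbb{C}^N$ we have

    \[ |\langle \mathbf{v}, G(z) \mathbf{w} \rangle - m(z) \langle \mathbf{v}, \mathbf{w} \rangle| \leq \varphi^{C_0} \sqrt{\frac{\operatorname{Im} m(z)}{N\eta}} \, \|\mathbf{v}\| \|\mathbf{w}\| \tag{2.10} \]

with $\zeta$-high probability.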
Remark 2.4. Using a simple lattice argument combined with the Lipschitz continuity of $z \mapsto G(z)$, one can
easily strengthen the statement (2.7) of Theorem 2.2 to a simultaneous high probability statement for all $z$,
as in (3.16) below. For more details, see e.g. Corollary 3.19 in [15].
Similarly, mimicking the proof of Lemma 7.2 below, we find

    \[ \sup \bigl\{ |\partial_z \langle \mathbf{v}, G(z) \mathbf{w} \rangle| : 2 + \varphi^{C_1} N^{-2/3} \leq |E| \leq \Sigma, \ 0 < |\eta| \leq \Sigma \bigr\} \leq N \tag{2.11} \]

with $\zeta$-high probability, from which we infer that the statement (2.10) of Theorem 2.3 holds with $\zeta$-high
probability simultaneously for all $z = E + i\eta$ satisfying the conditions in (2.11).

For an $N \times N$ matrix $A$ we denote by $\lambda_1(A) \leq \lambda_2(A) \leq \cdots \leq \lambda_N(A)$ the nondecreasing sequence of
eigenvalues of $A$. Moreover, we denote by $\sigma(A)$ the spectrum of $A$. It is convenient to abbreviate the
(random) eigenvalues of $H$ by

    \[ \lambda_\alpha := \lambda_\alpha(H). \]

Denote by $\mathbf{u}^{(1)}, \mathbf{u}^{(2)}, \dots, \mathbf{u}^{(N)} \in \mathbb{C}^N$ the normalized eigenvectors of $H$ associated with the eigenvalues $\lambda_1 \leq
\lambda_2 \leq \cdots \leq \lambda_N$. Our next result provides a bound on $\langle \mathbf{u}^{(\alpha)}, \mathbf{v} \rangle$ for arbitrary deterministic $\mathbf{v}$.

Theorem 2.5 (Isotropic delocalization). Fix $\zeta > 0$. Then there is a constant $C_0$ such that the following
holds for any deterministic and normalized $\mathbf{v} \in \mathbb{C}^N$.

(i) For any integers $a$ and $b$ satisfying $1 \leq a < b \leq N/2$ and

b-a > 2<p c ° (fcVSjv-i/e + (afcjVajv-i/a^ (2 .12) 

we have 

^-]T|(u(«),v)| 2 < ^iV" 1 (2.13) 

a— a 

with $\zeta$-high probability. Here $C_\zeta$ is the constant from Theorem 2.2. By symmetry, a similar result holds
for the eigenvectors with indices $\alpha \geq N/2$.

(ii) If the third moments of the entries of H vanish in the sense of (2.8), then we have the stronger 
statement 

sup|(u^,v)| 2 sC if^N' 1 (2.14) 

a 

with $\zeta$-high probability.

Remark 2.6. Theorem 2.5 implies that the coefficients of the eigenvectors of $H$ are strongly oscillating.
In order to see this, let $\alpha = 1, \dots, N$. If the third moments of the entries of $H$ do not vanish, we require
that $\alpha \notin [\varphi^{-C} N^{1/2}, N - \varphi^{-C} N^{1/2}]$. Then choosing $\mathbf{v} = N^{-1/2}(1, \dots, 1)$ and $\mathbf{v} = \mathbf{e}_i$ for $i = 1, \dots, N$ in
Theorem 2.5 yields

    \[ \Bigl| \sum_{i=1}^N u_i^{(\alpha)} \Bigr| \leq \varphi^{C}, \qquad \max_{1 \leq i \leq N} |u_i^{(\alpha)}| \leq \varphi^{C} N^{-1/2} \tag{2.15} \]

with $\zeta$-high probability (here $C$ is a generic constant in the sense of Section 2.1). The second inequality implies

    \[ \sum_{i=1}^N |u_i^{(\alpha)}| \geq \varphi^{-C} N^{1/2} \sum_{i=1}^N |u_i^{(\alpha)}|^2 = \varphi^{-C} N^{1/2} \]

with $\zeta$-high probability. Compare this with the first inequality of (2.15).

This behaviour is not surprising. In the GOE/GUE case, it is well known that each eigenvector $\mathbf{u}^{(\alpha)}$ is
uniformly distributed on the unit sphere, so that its entries asymptotically behave like i.i.d. Gaussians.
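
The contrast between the two sums in Remark 2.6 can be seen on a toy example. The sketch below is purely illustrative and makes a strong simplifying assumption: instead of an eigenvector of $H$ it uses a deterministic unit vector with entries $\pm N^{-1/2}$ of alternating sign. The function names are hypothetical. For such a vector the signed sum cancels, while the sum of absolute values is $N^{1/2}$.

```python
import math


def signed_sum_and_l1(u):
    """Return (|sum_i u_i|, sum_i |u_i|) for a real vector u."""
    return abs(sum(u)), sum(abs(x) for x in u)


def alternating_unit_vector(N):
    """Unit vector with entries of modulus N^(-1/2) and alternating signs (N even)."""
    s = 1.0 / math.sqrt(N)
    return [s if i % 2 == 0 else -s for i in range(N)]


if __name__ == "__main__":
    N = 10000
    signed, l1 = signed_sum_and_l1(alternating_unit_vector(N))
    # The signed sum stays bounded while the l1 norm grows like sqrt(N).
    print(signed, l1, math.sqrt(N))
```

A delocalized vector can only have a bounded signed sum and an $\ell^1$ norm of order $N^{1/2}$ at the same time if its entries change sign frequently, which is the oscillation referred to in the remark.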



2.3. Finite-rank deformation of Wigner matrices. Let $k \in \mathbb{N}$ be fixed, $V$ be a deterministic $N \times k$ matrix
satisfying $V^* V = \mathbb{1}$, and $d_1, \dots, d_k \in \mathbb{R} \setminus \{0\}$ be deterministic. We allow $d_1 \equiv d_1(N), \dots, d_k \equiv d_k(N)$ to
depend on $N$. We also use the notation $V = [\mathbf{v}^{(1)}, \dots, \mathbf{v}^{(k)}]$, where $\mathbf{v}^{(1)}, \dots, \mathbf{v}^{(k)} \in \mathbb{C}^N$ are orthonormal.
Define the rank-$k$ perturbation

    \[ V D V^* = \sum_{i=1}^k d_i \mathbf{v}^{(i)} (\mathbf{v}^{(i)})^*, \qquad D = \operatorname{diag}(d_1, \dots, d_k). \]

We shall study the spectrum of the deformed matrix

    \[ \widetilde{H} := H + V D V^*. \]

We abbreviate the eigenvalues of $\widetilde{H}$ by

    \[ \mu_\alpha := \lambda_\alpha(\widetilde{H}). \]

In order to state our results, we order the eigenvalues of $D$, i.e. we assume that $d_1 \leq \dots \leq d_k$. Define
the numbers

    \[ k^\pm := \# \{ i : \pm d_i > 1 \}. \]

As we shall see, $k^-$ is the number of outliers to the left of the bulk and $k^+$ the number of outliers to the
right of the bulk. We shall always assume that $k^-$ and $k^+$ are independent of $N$.
Let

    \[ O := \{ i \in \{1, \dots, k\} : |d_i| > 1 \} = \{ 1, \dots, k^-, k - k^+ + 1, \dots, k \} \tag{2.16} \]

denote the $k^- + k^+$ indices associated with the outliers. For $i \in O$ abbreviate the associated eigenvalue index
by

    \[ \alpha(i) := \begin{cases} i & \text{if } i \leq k^-, \\ N - k + i & \text{if } i \geq k - k^+ + 1. \end{cases} \tag{2.17} \]

Finally, for $d \in \mathbb{R} \setminus (-1, 1)$ we define

    \[ \theta(d) := d + \frac{1}{d}. \tag{2.18} \]
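
The bookkeeping of outlier indices is easy to get wrong by one, so a small sketch may help. It is an illustration under assumptions, not the paper's code: the function name is hypothetical, indices are 1-based as in the text, and the eigenvalue index is taken to be $i$ for outliers on the left of the bulk and $N - k + i$ for outliers on the right, so that the right outliers occupy the positions $N - k^+ + 1, \dots, N$.

```python
def outlier_bookkeeping(d, N):
    """Return (k_minus, k_plus, outliers) for perturbation eigenvalues d.

    outliers maps each 1-based index i with |d_i| > 1 (after sorting d in
    nondecreasing order) to the pair (alpha(i), theta(d_i)), where
    theta(d) = d + 1/d is the classical outlier location.
    """
    d = sorted(d)
    k = len(d)
    k_minus = sum(1 for x in d if x < -1.0)
    k_plus = sum(1 for x in d if x > 1.0)
    outliers = {}
    for i in range(1, k + 1):
        if i <= k_minus:
            alpha = i              # outlier to the left of the bulk
        elif i >= k - k_plus + 1:
            alpha = N - k + i      # outlier to the right of the bulk
        else:
            continue               # |d_i| <= 1: no outlier
        di = d[i - 1]
        outliers[i] = (alpha, di + 1.0 / di)
    return k_minus, k_plus, outliers


if __name__ == "__main__":
    # Hypothetical rank-4 perturbation of a 1000 x 1000 matrix.
    print(outlier_bookkeeping([-2.0, 0.5, 3.0, 1.5], 1000))
```

In this example $k^- = 1$ and $k^+ = 2$; the left outlier is $\mu_1$ near $-2.5$, and the right outliers are $\mu_{999}$ and $\mu_{1000}$ near $13/6$ and $10/3$.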

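Theorem 2.7 (Locations of the deformed eigenvalues). Fix $\zeta > 0$, $K > 0$, $k \in \mathbb{N}$, and $0 < b < 1/3$.
Then there exist positive constants $C_2$ and $C_3$ such that the following holds.
Choose a sequence $\psi \equiv \psi_N$ satisfying $1 \leq \psi \leq N^b$. Suppose that

    \[ |d_i| \leq \Sigma - 1, \qquad \bigl| |d_i| - 1 \bigr| \geq \varphi^{C_2} \psi N^{-1/3} \tag{2.19} \]

for all $i = 1, \dots, k$. Then for $i \in O$ we have

    \[ |\mu_{\alpha(i)} - \theta(d_i)| \leq \varphi^{C_3} N^{-1/2} (|d_i| - 1)^{1/2} \tag{2.20} \]

with $\zeta$-high probability. Moreover,

    \[ |\mu_\alpha - \lambda_{\alpha - k^-}| \leq \psi^{-1} N^{-2/3} \quad \text{for } k^- + 1 \leq \alpha \leq \varphi^K, \tag{2.21a} \]
    \[ |\mu_\alpha - \lambda_{\alpha + k^+}| \leq \psi^{-1} N^{-2/3} \quad \text{for } N - \varphi^K \leq \alpha \leq N - k^+, \tag{2.21b} \]

with $\zeta$-high probability.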
Remark 2.8. In [12], Capitaine, Donati-Martin, and Féral proved that $\mu_{\alpha(i)} \to \theta(d_i)$ almost surely for all
$i \in O$, under the assumptions that (i) $D$ does not depend on $N$ and (ii) the law of the entries of $H$ is symmetric
and satisfies a Poincaré inequality. Subsequently, the assumption (ii) was relaxed by Pizzo, Renfrew, and
Soshnikov [32]. In fact, in [32] the authors proved, assuming (i), that the sequence $\sqrt{N} (\mu_{\alpha(i)} - \theta(d_i))$ is
bounded in probability for all $i \in O$.

In [5,6], Benaych-Georges, Guionnet, and Maïda considered deformations of Wigner matrices by finite-rank
random matrices whose eigenvalues are independent of $N$ and whose eigenvectors are either independent
copies of a random vector with i.i.d. centred components satisfying a log-Sobolev inequality or are obtained
by Gram-Schmidt orthonormalization of such independent copies. For these random perturbation models,
they established eigenvalue sticking estimates similar to (2.21).

Remark 2.9. Provided one is only interested in the locations of the outliers, i.e. (2.20), one can set $\psi = 1$
in Theorem 2.7.

We shall refer to the eigenvalues in (2.20), i.e. $\mu_1, \dots, \mu_{k^-}, \mu_{N - k^+ + 1}, \dots, \mu_N$, as the outliers, and to the
eigenvalues in (2.21), i.e. $\mu_{k^- + 1}, \dots, \mu_{\varphi^K}, \mu_{N - \varphi^K}, \dots, \mu_{N - k^+}$, as the extremal bulk eigenvalues.

Remark 2.10. The phase transition associated with $d_i$ happens on the scale $d_i = 1 + a_i N^{-1/3}$ where $a_i$ is
of order one. The condition (2.19) is optimal (up to powers of $\varphi$) in the sense that the power of $N$ in (2.19)
cannot be reduced. Indeed, in [3,9,10,31] it is established that, for rank-one$^1$ deformations of GOE/GUE
with $d = 1 + a N^{-1/3}$ and $a$ of order one, $\mu_N$ fluctuates on the scale $N^{-2/3}$ and its distribution differs from
that of $\lambda_N$. Hence in that case (2.21) cannot hold for $\psi \gg 1$. See also Remark 2.13 below for a more detailed
discussion of the qualitative behaviour of eigenvalues of $\widetilde{H}$ as $d_i$ crosses a transition point.

Note that the location $\theta(d_i)$ of the outlier associated with $d_i = 1 + a_i N^{-1/3}$ satisfies $\theta(d_i) = 2 + N^{-2/3} a_i^2 +
O(a_i^3 N^{-1})$. In comparison, the largest eigenvalue of $H$ fluctuates on a scale $N^{-2/3}$ around 2.

Remark 2.11. An immediate corollary of Theorem 2.7 is the universality of the extremal bulk eigenvalues
of $\widetilde{H}$. In other words, under the assumption $\bigl| |d_i| - 1 \bigr| \geq \varphi^{C_2 + 1} N^{-1/3}$ for all $i$, the statistics of the extremal
bulk eigenvalues of $\widetilde{H}$ coincide with those of GOE/GUE.

Indeed, choosing $\psi = \varphi$ in Theorem 6.3 and invoking the edge universality for the Wigner matrix $H$
proved in Theorem 1.1 of [30] (for similar results, see also [16,26]), we find for all $\ell \in \mathbb{N}$ and all bounded and
continuous $f$ that

    \[ \lim_{N \to \infty} \Bigl[ \mathbb{E} f\bigl( N^{2/3} (\mu_{k^- + 1} + 2), \dots, N^{2/3} (\mu_{k^- + \ell} + 2) \bigr) - \mathbb{E}^G f\bigl( N^{2/3} (\lambda_1 + 2), \dots, N^{2/3} (\lambda_\ell + 2) \bigr) \Bigr] = 0, \]

where $\mathbb{E}^G$ denotes expectation with respect to the $N \times N$ GOE/GUE matrices. A similar result holds at the
other end of the spectrum.

Remark 2.12. Theorem 2.7 was formulated for deterministic perturbations. However, it extends trivially 
to the case where V is random, independent of H, with arbitrary law satisfying V*V = 1. 

$^1$ For simplicity of presentation, we consider rank-one deformations, although the results of [3,9,10,31] hold for rank-$k$
deformations.

Remark 2.13. The parameter $\psi$ describes how strongly the extremal bulk eigenvalues of $\widetilde{H}$ stick to extremal
eigenvalues of $H$. If $d_i$ is within distance $C N^{-1/3}$ of a transition point $\pm 1$, one does not expect the eigenvalues
of $\widetilde{H}$ to stick to the eigenvalues of $H$. For very weak sticking on the scale $N^{-2/3} \varphi^{-1}$, corresponding to $\psi = \varphi$,
the eigenvalues $d_i$ have to satisfy $\bigl| |d_i| - 1 \bigr| \geq \varphi^{C_2 + 1} N^{-1/3}$. In particular, we may allow outliers at a distance
$\varphi^{2 C_2 + 2} N^{-2/3}$ from the spectral edge.

On the other hand, in order to obtain strong sticking on the scale $N^{-1 + \varepsilon}$, corresponding to $\psi = N^{1/3 - \varepsilon}$,
the eigenvalues $d_i$ have to satisfy $\bigl| |d_i| - 1 \bigr| \geq \varphi^{C_2} N^{-\varepsilon}$. Now the outliers have to lie at a distance of at least
$\varphi^{2 C_2} N^{-2\varepsilon}$ from the spectral edge.

Thus, Theorem 2.7 gives a clear picture of what happens to the extremal bulk eigenvalues as $d_i$ passes a
transition point $\pm 1$. For definiteness, consider the case where $d_i$ is varied from $1 - c$ to $1 + c$ for some small
$c > 0$, and all other eigenvalues of $D$ are kept constant. Consider an extremal bulk eigenvalue near $+2$, say
$\mu_\alpha$. By Theorem 2.7, for $d_i \leq 1 - \varphi^{C_2 + 1} N^{-1/3}$, $\mu_\alpha$ sticks to $\lambda_\beta$ where $\beta := \alpha + k^+$. As $d_i$ approaches 1, the
eigenvalue $\mu_\alpha$ progressively detaches itself from $\lambda_\beta$. Theorem 2.7 allows one to follow this behaviour down
to $|d_i - 1| = \varphi^{C_2 + 1} N^{-1/3}$. Below this scale, as $d_i$ passes 1, the eigenvalue $\mu_\alpha$ "jumps" from the vicinity
of $\lambda_\beta$ to the vicinity of $\lambda_{\beta + 1}$. This jump happens in the range $d_i \in [1 - \varphi^{C_2 + 1} N^{-1/3}, 1 + \varphi^{C_2 + 1} N^{-1/3}]$.
After the jump, i.e. for $d_i \geq 1 + \varphi^{C_2 + 1} N^{-1/3}$, the eigenvalue $\mu_\alpha$ sticks to $\lambda_{\beta + 1}$ instead of $\lambda_\beta$, provided that
$\beta < N$. If $\beta = N$, then $\mu_\alpha$ escapes from the bulk spectrum and becomes an outlier. This jump happens
simultaneously for all extremal bulk eigenvalues near $+2$, and is accompanied by the creation of an outlier.
This may be expressed as $(k^0, k^+) \mapsto (k^0 - 1, k^+ + 1)$. Meanwhile, the extremal bulk eigenvalues on the
other side of the spectrum, i.e. near $-2$, remain unaffected by the transition, and continue sticking to the
same eigenvalues of $H$ they stuck to before the transition.

Next, we identify the distribution of the outliers. We introduce the customary symmetry index $\beta$, by
definition equal to 1 if $H$ is real symmetric and 2 if $H$ is complex Hermitian. In order to state our result,
we define the moment matrices $M^{(3)} = (M^{(3)}_{ij})$ and $M^{(4)} = (M^{(4)}_{ij})$ of $H$ through

    \[ M^{(3)}_{ij} := N^{3/2} \, \mathbb{E} \bigl( |h_{ij}|^2 h_{ij} \bigr), \qquad M^{(4)}_{ij} := N^2 \, \mathbb{E} |h_{ij}|^4. \]

By definition of $H$, the matrices $M^{(3)}$ and $M^{(4)}$ are Hermitian. Moreover, by (2.1) they have uniformly
bounded entries. For $\mathbf{v} = (v_i) \in \mathbb{C}^N$ define

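    \[ R(\mathbf{v}) := \frac{1}{N} \sum_{i,j} \bigl( M^{(4)}_{ij} - 4 + \beta \bigr) |v_i|^4, \]

    \[ S(\mathbf{v}) := \frac{1}{N} \sum_{i,j} \bar{v}_i M^{(3)}_{ij} v_j. \tag{2.22} \]

The functions $Q$, $R$, and $S$ are bounded on the unit ball in $\mathbb{C}^N$, uniformly in $N$.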

Theorem 2.14 (Distribution of the outliers). There is a constant $C_2$ such that the following holds.
Suppose that

    \[ |d_i| \leq \Sigma - 1, \qquad \bigl| |d_i| - 1 \bigr| \geq \varphi^{C_2} N^{-1/3} \tag{2.23} \]

for all $i = 1, \dots, k$. Suppose moreover that for all $i \in O$ we have

    \[ \min_{j \neq i} |d_i - d_j| \geq \varphi^{C_2} N^{-1/2} (|d_i| - 1)^{-1/2}. \tag{2.24} \]
For $i \in O$ define the random variable

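    \[ \Pi_i := (|d_i| + 1) (|d_i| - 1)^{1/2} \biggl( \frac{N^{1/2} \langle \mathbf{v}^{(i)}, H \mathbf{v}^{(i)} \rangle}{d_i^2} + \frac{S(\mathbf{v}^{(i)})}{d_i^4} \biggr) \]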

and $\Upsilon_i$, a random variable independent of $\Pi_i$ with law

T + m + lfm - "(^ + ^)) ■ 

Then we have, for all $i \in O$ and all bounded and continuous $f$,

    \[ \lim_{N \to \infty} \Bigl[ \mathbb{E} f\bigl( N^{1/2} (|d_i| - 1)^{-1/2} (\mu_{\alpha(i)} - \theta(d_i)) \bigr) - \mathbb{E} f(\Pi_i + \Upsilon_i) \Bigr] = 0. \tag{2.25} \]

Note that, by a standard approximation argument, (2.25) also holds for $f(x) = \mathbf{1}(x \leq a)$ where $a \in \mathbb{R}$;
hence the convergence (2.25) may also be stated in terms of distribution functions.

Remark 2.15. In [11], Capitaine, Donati-Martin, and Féral identified the law of the outliers of deformed
Wigner matrices subject to the following conditions: (i) $D$ is independent of $N$ but may have degenerate
eigenvalues; (ii) the law of the matrix entries of $H$ is symmetric and satisfies a Poincaré inequality; (iii)
the eigenvectors of the deformation belong to one of two classes, corresponding roughly to either partially
delocalized eigenvectors or strongly localized eigenvectors. Subsequently, the assumption (ii) was relaxed by
Pizzo, Renfrew, and Soshnikov in [32]. (However, assumption (iii), which imposes that $S(\mathbf{v}^{(i)}) = Q(\mathbf{v}^{(i)}) = 0$, is still
required for the results of [32].)

Remark 2.16. The condition (2.24) has the following interpretation. Let $i \in O$ and assume for definiteness
that $d_i > 1$. If $j$ is not associated with an outlier on the right-hand side of the bulk, i.e. if $d_j < 1$, then $d_i - d_j$
is bounded from below by the right-hand side of (2.24), as follows from (2.23). Hence the condition (2.24) is
only needed to ensure that the outliers are not too close to each other; in fact, this condition is optimal (up
to the factor $\varphi^{C_2}$) in guaranteeing that the distributions of the outliers have essentially no overlap. Indeed,
by Theorem 2.7 we know that $\mu_{\alpha(i)}$ lies with $\zeta$-high probability in an interval of length $2 \varphi^{C_3} N^{-1/2} (d_i - 1)^{1/2}$
centred around $\theta(d_i)$. Moreover, differentiating (2.18) yields

    \[ \theta(d_j) - \theta(d_i) \asymp (d_i - 1)(d_j - d_i). \]

Imposing the condition $|\theta(d_j) - \theta(d_i)| \gg \varphi^{C_3} N^{-1/2} (d_i - 1)^{1/2}$ leads to (2.24) (with $C_2$ increased if necessary
so that $C_2 \geq C_3$). In fact, in [3,31,32] it was proved (for $D$ independent of $N$) that the distribution associated
with degenerate outliers is not Gaussian.

The following remarks discuss some special cases of interest. In order to simplify notations, we set $k = 1$
and write $d \equiv d_1$, $\mathbf{v} \equiv \mathbf{v}^{(1)}$, $\Pi \equiv \Pi_1$, and $\Upsilon \equiv \Upsilon_1$.

Remark 2.17. In the GOE/GUE case, we have $M^{(3)} = 0$ and $M^{(4)}_{ij} = (4 - \beta) + \delta_{ij} (17 - 8\beta)$. Thus we get
that $Q(\mathbf{v}) = S(\mathbf{v}) = 0$ and $R(\mathbf{v}) = O(N^{-1})$. Since $N^{1/2} \langle \mathbf{v}, H \mathbf{v} \rangle$ is a centred Gaussian with variance $2\beta^{-1}$,
we therefore find that $\Pi + \Upsilon$ has asymptotically$^2$ the distribution of a centred Gaussian with variance

    \[ \frac{2 (|d| + 1)^2 (|d| - 1)}{\beta d^4} + \frac{2 (|d| + 1)}{\beta d^4} = \frac{2 (|d| + 1)}{\beta d^2}. \]

$^2$ See Section 7.2 for precise definitions and more details.
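
The variance identity of Remark 2.17 and the Gaussian fourth moments can be checked mechanically. The sketch below is illustrative only; all function names are hypothetical. It assumes the normalization $\mathbb{E} h_{ii}^2 = 2/(\beta N)$ and $\mathbb{E} |h_{ij}|^2 = 1/N$ for $i \neq j$, and it reads the first summand of the variance as the contribution of $\Pi$ and the second as that of $\Upsilon$ in the Gaussian case.

```python
def variance_pi(d, beta):
    """Variance of Pi: (|d|+1)^2 (|d|-1) d^(-4) times the variance 2/beta of N^(1/2) <v, H v>."""
    a = abs(d)
    return 2.0 * (a + 1.0) ** 2 * (a - 1.0) / (beta * d**4)


def variance_upsilon_gaussian(d, beta):
    """Second summand of the variance display in the GOE/GUE case."""
    return 2.0 * (abs(d) + 1.0) / (beta * d**4)


def variance_total(d, beta):
    """Right-hand side of the variance identity."""
    return 2.0 * (abs(d) + 1.0) / (beta * d**2)


def gaussian_fourth_moment(beta, diagonal):
    """N^2 E|h_ij|^4 for GOE (beta = 1) or GUE (beta = 2): (4 - beta) + delta_ij (17 - 8 beta)."""
    return (4 - beta) + ((17 - 8 * beta) if diagonal else 0)


if __name__ == "__main__":
    for beta in (1, 2):
        for d in (1.5, -3.0, 10.0):
            lhs = variance_pi(d, beta) + variance_upsilon_gaussian(d, beta)
            print(beta, d, lhs, variance_total(d, beta))
```

The identity follows from $(|d| + 1)(|d| - 1) + 1 = d^2$. The fourth-moment formula reproduces the familiar values: 3 (off-diagonal) and 12 (diagonal) for GOE, 2 and 3 for GUE.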
Remark 2.18. If $\varphi^{C_2} N^{-1/3} \leq \bigl| |d| - 1 \bigr| = o(1)$ then $\Pi + \Upsilon$ converges weakly to a centred Gaussian with
variance $4\beta^{-1}$. As an outlier approaches the bulk spectrum, the dependence of its distribution on the details
of $H$ and $\mathbf{v}$ is washed out. Therefore, unlike outliers located at a distance of order one from the bulk
spectrum, outliers close to $\pm 2$ exhibit universality. Moreover, as an outlier approaches the bulk, its variance
shrinks from $N^{-1}$ (for $d - 1 \asymp 1$) to $N^{-4/3}$ (for $d - 1 \asymp N^{-1/3}$).

Remark 2.19. If $\max_i |v_i| \to 0$ as $N \to \infty$, we find that $Q(v)$ and $R(v) \to 0$ as $N \to \infty$. Moreover, the
Central Limit Theorem implies in this case that $N^{1/2} \langle v, Hv \rangle$ converges in distribution to a centred Gaussian
with variance $2\beta^{-1}$. Therefore $\Pi + \Upsilon$ has asymptotically the distribution of

/ (MI + i)(M|-i)V 2 ,g(v) 2(M| + i) 

Thus, the only difference to the GOE/GUE case is a shift caused by the nonvanishing third moments
of $H$. For example, if $\mu^{(3)}_{ij} = \mu^{(3)} \in \mathbb{R}$ is independent of $i$ and $j$, and $v = N^{-1/2}(1, \dots, 1)$, we find
$S(v) = \mu^{(3)} + O(N^{-1})$.

Remark 2.20. Typically, $R(v)$ is nonzero if $v$ has entries which do not converge to zero. An example for
which $Q(v)$ is nonzero is $\mu^{(3)}_{ij} = \mu^{(3)} \in \mathbb{R}$ independent of $N$ and $v = (2^{-1/2}, (2N-2)^{-1/2}, \dots, (2N-2)^{-1/2})$,
in which case we have $Q(v) = 2^{-3/2} \mu^{(3)} + O(N^{-1/2})$.

Remark 2.21. Consider now the case where $\max_i |v_i|$ does not tend to zero as $N \to \infty$. For definiteness,
let $v = (u, w)$, where the dimension of $u$ is constant and $\max_i |w_i| \to 0$ as $N \to \infty$. By the Central
Limit Theorem and a short variance calculation, iV 1 / 2 (v,i?v) has asymptotically the same distribution as 
A rl / 2 (u, Hu) + 2/3~ 1 (l — ||u|| 2 )(l + 2||u|| 2 )Z, where Z is a standard normal random variable independent of 
H. 

Let us take for example $v = (1, 0, \dots, 0)$. Then $\Pi + \Upsilon$ has asymptotically the same distribution as $\Pi' + \Upsilon'$,
where

$$\Pi' := (|d| + 1)(|d| - 1)^{1/2} d^{-2} N^{1/2} h_{11} ,$$

and $\Upsilon'$ is a centred Gaussian, independent of $\Pi'$, with variance



3. Proof of Theorem 2.2, Case A 

In this section we prove Theorem 2.2 in the case A, i.e. where the first three moments of the entries of H 
coincide with those of GOE/GUE. 

We start by introducing the following notations, which we shall use throughout the rest of the paper. For an
$N \times N$ matrix $A$ and $v, w \in \mathbb{C}^N$ we abbreviate

$$A_{vw} := \langle v, A w \rangle .$$

We also write

$$A_{vi} := A_{v e_i} , \qquad A_{iv} := A_{e_i v} , \qquad A_{ij} := A_{e_i e_j} ,$$

where $e_i \in \mathbb{C}^N$ denotes the $i$-th standard basis vector.

For definiteness, we consider the case where H is a complex Hermitian Wigner matrix; the proof for real 
symmetric Wigner matrices is the same. By Markov's inequality, in order to prove Theorem 2.2 it suffices 
to prove the following result. 

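Proposition 3.1. Assume (2.8) and let $\zeta > 0$ be fixed. Then there exists a constant $C_\zeta$ such that, for all
$n \leq \varphi^\zeta$, all deterministic $v, w \in \mathbb{C}^N$, and all $z \in S(C_\zeta)$,

$$\mathbb{E} |G_{vw}(z) - \langle v, w \rangle m(z)|^n \leq \big( \varphi^{C_\zeta} \Psi(z) \|v\| \|w\| \big)^n . \qquad (3.1)$$

The rest of this section is devoted to the proof of Proposition 3.1.

3.1. Preliminaries. We start with a few basic tools. For $E \in \mathbb{R}$ define

$$\kappa_E := \big| |E| - 2 \big| , \qquad (3.2)$$

the distance from $E$ to the spectral edges $\pm 2$. In the following we use the notations

$$z = E + i\eta , \qquad \kappa = \kappa_E$$

without further comment. The following lemma collects some useful properties of $m$, the Stieltjes transform
of the semicircle law.

Lemma 3.2. For $|z| \leq 2\Sigma$ we have

$$|m(z)| \asymp 1 , \qquad |1 - m(z)^2| \asymp \sqrt{\kappa + \eta} . \qquad (3.3)$$

Moreover,

$$\operatorname{Im} m(z) \asymp \begin{cases} \sqrt{\kappa + \eta} & \text{if } |E| \leq 2 , \\ \eta / \sqrt{\kappa + \eta} & \text{if } |E| \geq 2 . \end{cases}$$

(Here the implicit constants depend on $\Sigma$.)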



Proof. The proof is an elementary calculation; see Lemma 4.2 in [27]. 



□ 
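The scalings in Lemma 3.2 can be probed numerically. The sketch below assumes the standard closed form of the Stieltjes transform of the semicircle law on $[-2, 2]$, namely the root of $m^2 + zm + 1 = 0$ with positive imaginary part, together with the usual two-regime scale for $\operatorname{Im} m$; the function names are hypothetical.

```python
import cmath

def m_semicircle(z):
    """Stieltjes transform of the semicircle law on [-2, 2] for Im z > 0.

    Assumption: m solves m^2 + z m + 1 = 0 and has positive imaginary part.
    """
    root = cmath.sqrt(z * z - 4)
    m = (-z + root) / 2
    if m.imag < 0:  # the two roots have imaginary parts of opposite sign
        m = (-z - root) / 2
    return m

def kappa(energy):
    """Distance from the energy E to the spectral edges +-2."""
    return abs(abs(energy) - 2.0)

def im_m_scale(z):
    """Expected size of Im m(z): sqrt(kappa + eta) inside the bulk,
    eta / sqrt(kappa + eta) outside."""
    energy, eta = z.real, z.imag
    s = (kappa(energy) + eta) ** 0.5
    return s if abs(energy) <= 2.0 else eta / s
```

Evaluating `m_semicircle(z).imag / im_m_scale(z)` at a bulk point, an edge point, and a point outside the spectrum gives ratios of order one, which is the content of the comparison $\asymp$.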



In addition to $\Psi$, we shall make use of a larger control parameter $\Phi$, defined as





(3.4) 



From Lemma 3.2 we find, for any $z$ satisfying $|z| \leq 2\Sigma$,




(3.5) 



where $A_N \lesssim B_N$ means $A_N \leq C B_N$ for some constant $C$.

We shall often need to consider minors of H, which are the content of the following definition. 
Definition 3.3 (Minors). For $T \subset \{1, \dots, N\}$ we define $H^{(T)}$ by

$$(H^{(T)})_{ij} := \mathbf{1}(i \notin T) \mathbf{1}(j \notin T) h_{ij} .$$

Moreover, we define the resolvent of $H^{(T)}$ through

$$G^{(T)}_{ij}(z) := \mathbf{1}(i \notin T) \mathbf{1}(j \notin T) (H^{(T)} - z)^{-1}_{ij} .$$

We also set

$$\sum_i^{(T)} := \sum_{i : i \notin T} .$$

When $T = \{a\}$, we abbreviate $(\{a\})$ by $(a)$ in the above definitions; similarly, we write $(ab)$ instead of $(\{a, b\})$.

We shall also need the following resolvent identities, proved in Lemma 4.2 of [25] and Lemma 6.10 of [16].

Lemma 3.4 (Resolvent identities). For any $i, j, k$ we have

$$G_{ij} = G^{(k)}_{ij} + \frac{G_{ik} G_{kj}}{G_{kk}} . \qquad (3.6)$$

Moreover, for $i \neq j$ we have

$$G_{ij} = -G_{ii} \sum_k^{(i)} h_{ik} G^{(i)}_{kj} = -G_{jj} \sum_k^{(j)} G^{(j)}_{ik} h_{kj} . \qquad (3.7)$$

These identities also hold for minors $H^{(T)}$.

It is an immediate consequence of (3.6) that

$$G_{vw} = G^{(k)}_{vw} + \frac{G_{vk} G_{kw}}{G_{kk}} . \qquad (3.8)$$


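Moreover, we introduce the notations

$$\mathcal{G}_{vi} := -\sum_k^{(i)} G^{(i)}_{vk} h_{ki} , \qquad \mathcal{G}_{iv} := -\sum_k^{(i)} h_{ik} G^{(i)}_{kv} , \qquad (3.9)$$

so that

$$G_{vi} = G_{ii} (\bar{v}_i + \mathcal{G}_{vi}) , \qquad G_{iv} = G_{ii} (v_i + \mathcal{G}_{iv}) \qquad (3.10)$$

by (3.7).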
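The identities (3.6) and (3.7) are exact algebraic statements, so they can be verified on a small random matrix. The following sketch uses only the standard library; it assumes the convention of Definition 3.3 that entries of $G^{(T)}$ with an index in $T$ vanish, and it uses a real symmetric Gaussian matrix purely as test data. All function names are hypothetical.

```python
import random

def inverse(a):
    """Invert a small square (complex) matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def wigner(n, rng):
    """Real symmetric test matrix with centred Gaussian entries of size n^{-1/2}."""
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            h[i][j] = h[j][i] = rng.gauss(0.0, 1.0) / n ** 0.5
    return h

def resolvent(h, z, removed=()):
    """G^{(T)}(z) with T = removed; rows and columns indexed by T are zero."""
    n = len(h)
    keep = [i for i in range(n) if i not in removed]
    sub = [[h[i][j] - (z if i == j else 0.0) for j in keep] for i in keep]
    inv = inverse(sub)
    g = [[0j] * n for _ in range(n)]
    for p, i in enumerate(keep):
        for q, j in enumerate(keep):
            g[i][j] = inv[p][q]
    return g

def identity_errors(n=6, z=0.3 + 0.5j, seed=0):
    """Largest violation of (3.6) over all i, j, k and of (3.7) over all i != j."""
    h = wigner(n, random.Random(seed))
    g = resolvent(h, z)
    err36 = err37 = 0.0
    for k in range(n):
        gk = resolvent(h, z, removed=(k,))
        for i in range(n):
            for j in range(n):
                rhs = gk[i][j] + g[i][k] * g[k][j] / g[k][k]
                err36 = max(err36, abs(g[i][j] - rhs))
            if i_neq := [j for j in range(n) if j != k]:
                for j in i_neq:
                    rhs = -g[k][k] * sum(h[k][l] * gk[l][j] for l in range(n) if l != k)
                    err37 = max(err37, abs(g[k][j] - rhs))
    return err36, err37
```

Both returned errors are at the level of floating-point rounding. Note that (3.6) holds trivially when $i = k$ or $j = k$, since then $G^{(k)}_{ij} = 0$ by definition; this is what makes (3.8) an immediate consequence.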

Next, we record some basic large deviations estimates. 

Lemma 3.5 (Large deviations estimates). Let $a_1, \dots, a_N, b_1, \dots, b_M$ be independent random variables
with zero mean and unit variance. Assume that there is a constant $\vartheta > 0$ such that

$$\mathbb{P}(|a_i| \geq x) \leq \vartheta^{-1} \exp(-x^\vartheta) \quad (i = 1, \dots, N) ,$$

$$\mathbb{P}(|b_i| \geq x) \leq \vartheta^{-1} \exp(-x^\vartheta) \quad (i = 1, \dots, M) . \qquad (3.11)$$

Then there exists a constant $\rho \equiv \rho(\vartheta) > 1$ such that, for any $\zeta > 0$ and any deterministic complex numbers
$A_i$ and $B_{ij}$, we have with $\zeta$-high probability



N 



E^ a * 

i=i 



E a i B H a 3 



/ N \ V 2 



1/2 



1/2 



1/2 



(3.12) 
(3.13) 
(3.14) 
(3.15) 



Proof. The estimates (3.12) - (3.14) were proved in Appendix B of [25]. The estimate (3.15) follows easily
from (3.12) in two steps. Defining $A_i := \sum_j B_{ij} b_j$, (3.12) yields $|A_i| \leq \varphi^{\rho\zeta} \big( \sum_j |B_{ij}|^2 \big)^{1/2}$ with $\zeta$-high
probability. Since the families $\{A_i\}$ and $\{a_i\}$ are independent, (3.15) follows by using (3.12) again. □

Finally, we quote the following results, which are proved in Theorems 2.1 and 2.2 of [26]. (Recall that we
use the notation $m$ for the quantity denoted by $m_{sc}$ in [26].)

Theorem 3.6 (Local semicircle law). Fix $\zeta > 0$. Then there exists a constant $C_\zeta$ such that the event

$$\bigcap_{z \in S(C_\zeta)} \Big\{ \max_{i,j} |G_{ij}(z) - \delta_{ij} m(z)| \leq \varphi^{C_\zeta} \Psi(z) \Big\} \qquad (3.16)$$

holds with $\zeta$-high probability.

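Denote by $\gamma_1 \leq \gamma_2 \leq \cdots \leq \gamma_N$ the classical locations of the eigenvalues of $H$, defined through

$$N \int_{-\infty}^{\gamma_\alpha} \varrho(x) \, dx = \alpha \qquad (1 \leq \alpha \leq N) . \qquad (3.17)$$

Theorem 3.7 (Rigidity of eigenvalues). Fix $\zeta > 0$. Then there exists a constant $C_\zeta$ such that

$$|\lambda_\alpha - \gamma_\alpha| \leq \varphi^{C_\zeta} \big( \min\{\alpha, N + 1 - \alpha\} \big)^{-1/3} N^{-2/3}$$

for all $\alpha = 1, \dots, N$ with $\zeta$-high probability.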
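To make the classical locations concrete, the following sketch solves the defining equation by bisection. It assumes the normalization $N \int_{-\infty}^{\gamma_\alpha} \varrho = \alpha$ with $\varrho(x) = (2\pi)^{-1} \sqrt{(4 - x^2)_+}$, whose distribution function has a closed form; the function names are hypothetical, and the last helper only evaluates the $N$-dependent scale in Theorem 3.7 without the factor $\varphi^{C_\zeta}$.

```python
import math

def semicircle_cdf(x):
    """Distribution function of the semicircle law on [-2, 2]."""
    if x <= -2.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    return 0.5 + x * math.sqrt(4.0 - x * x) / (4.0 * math.pi) + math.asin(x / 2.0) / math.pi

def classical_location(alpha, n, tol=1e-13):
    """Solve N * int_{-inf}^{gamma} rho(x) dx = alpha for gamma by bisection."""
    target = alpha / n
    lo, hi = -2.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if semicircle_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rigidity_scale(alpha, n):
    """The scale (min{alpha, N+1-alpha})^{-1/3} N^{-2/3} from Theorem 3.7."""
    return min(alpha, n + 1 - alpha) ** (-1.0 / 3.0) * n ** (-2.0 / 3.0)
```

The scale is of order $N^{-1}$ in the bulk, where $\min\{\alpha, N + 1 - \alpha\} \asymp N$, and of order $N^{-2/3}$ at the edges.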

3.2. Estimate of $G_{vi}$. After these preparations, we may prove the key tool behind the proof of Proposition
3.1. It will be used as input in the Green function comparison method, throughout Sections 3.3, 3.4, and 4.
Let us sketch its importance in the Green function comparison method. Anticipating the notation from the
proof of Lemma 3.9, we shall have to estimate quantities of the form

$$(S - R)_{vv} = \big( -N^{-1/2} RVR + N^{-1} RVRVR + \cdots \big)_{vv} ,$$

where the right-hand side is a resolvent expansion of the left-hand side. The first matrix product on the
right-hand side may be written as

$$(RVR)_{vv} = R_{va} V_{ab} R_{bv} + R_{vb} V_{ba} R_{av}$$

(again anticipating the notation from the proof of Lemma 3.9). Lemma 3.8 will be used to estimate the
resolvent entries of the form $R_{va}$ in such error estimates. These resolvent entries arise whenever the Green
function comparison method is applied to the component $(\cdot)_{vv}$ of a resolvent.

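Lemma 3.8. For any $\zeta > 0$ there exists a constant $C_\zeta$ such that

$$|\mathcal{G}_{vi}(z)| + |\mathcal{G}_{iv}(z)| + |G_{vi}(z)| + |G_{iv}(z)| \leq \varphi^{C_\zeta} \sqrt{\frac{\operatorname{Im} G_{vv}(z)}{N\eta}} + C |v_i| \qquad (3.18)$$

holds with $\zeta$-high probability for all $z \in S(C_\zeta)$.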

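Proof. Since the families $(h_{ki})_k$ and $(G^{(i)}_{vk})_k$ are independent, (3.9), (3.12), and (2.1) yield

$$|\mathcal{G}_{vi}| \leq \varphi^{C_\zeta} \Big( \frac{1}{N} \sum_k^{(i)} |G^{(i)}_{vk}|^2 \Big)^{1/2}$$

with $\zeta$-high probability for some constant $C_\zeta$. By spectral decomposition one easily finds that

$$\sum_k^{(i)} |G^{(i)}_{vk}|^2 = \frac{1}{\eta} \operatorname{Im} G^{(i)}_{vv} .$$

From (3.3) and (3.16) we find that

$$|G_{ii}| \leq C \qquad (3.19)$$

with $\zeta$-high probability provided that $\eta \geq \varphi^{C_\zeta} N^{-1}$ for some large enough $C_\zeta$. Setting

$$X := |\mathcal{G}_{vi}| + |\mathcal{G}_{iv}| ,$$

we therefore conclude, using first (3.8) and then (3.10), that

$$X \leq \varphi^{C_\zeta} \sqrt{\frac{\operatorname{Im} G_{vv} + |G_{ii}| \, |G_{vi}/G_{ii}| \, |G_{iv}/G_{ii}|}{N\eta}} \leq \varphi^{C_\zeta} \sqrt{\frac{\operatorname{Im} G_{vv}}{N\eta}} + \varphi^{C_\zeta} \frac{|v_i|}{\sqrt{N\eta}} + \varphi^{C_\zeta} \frac{X}{\sqrt{N\eta}}$$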

with C-high probability. Thus we find for r\ ^ ip 2C <N~ 1 



r, /ImGw , 
X<^___ + N 

with $\zeta$-high probability, and the claim for $|\mathcal{G}_{vi}| + |\mathcal{G}_{iv}|$ follows. The claim for $|G_{vi}| + |G_{iv}|$ follows using
(3.10) and (3.19). □
3.3. Estimate of $\operatorname{Im} G_{vv}$. The first step in the proof of Proposition 3.1 is the following estimate of $\operatorname{Im} G_{vv}$.
Note that $\operatorname{Im} G_{vv}$ is a nonnegative quantity, as may be easily seen by spectral decomposition of $G$.

Lemma 3.9. Let $\zeta > 0$ be fixed. Then there exists a constant $C_\zeta$ such that, for all $n \leq \varphi^\zeta$, all deterministic
and normalized $v \in \mathbb{C}^N$, and all $z \in S(C_\zeta)$, we have

$$\mathbb{E} (\operatorname{Im} G_{vv}(z))^n \leq (\varphi^{C_\zeta} \Phi(z))^n . \qquad (3.20)$$

Proof. We shall prove (3.20) using Green function comparison to GOE/GUE. First we claim that (3.20)
holds if $H$ is a GOE/GUE matrix. Indeed, in that case, using unitary invariance, (3.5), and (3.16), we find
for $z \in S(C_\zeta)$ that

E(lmG vv (z)) n = E(lmGnW)" «S (<p c <$(z)) n + N n N c exp(-^) , 

where in the last inequality we used the rough bound $|G_{11}(z)| \leq \eta^{-1} \leq N$. Thus (3.20) for GOE/GUE
follows from (3.5) and the estimate 

7V c "exp(-^ 2C ) sc G, 

valid for n ^ ip 1 *. 

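From now on we work on the product space generated by the Wigner matrix $H = (N^{-1/2} W_{ij})_{i,j}$ and
the GOE/GUE matrix $(N^{-1/2} V_{ij})_{i,j}$. We fix a bijective ordering map on the index set of the independent
matrix elements,

$$\phi : \{(i, j) : 1 \leq i \leq j \leq N\} \to \{1, \dots, \gamma_{\max}\} \quad \text{where} \quad \gamma_{\max} := \frac{N(N+1)}{2} , \qquad (3.21)$$

and denote by $H_\gamma = (h^\gamma_{ij})$, $\gamma = 0, \dots, \gamma_{\max}$, the Wigner matrix whose upper-triangular entries are defined
by

$$h^\gamma_{ij} := \begin{cases} N^{-1/2} W_{ij} & \text{if } \phi(i, j) \leq \gamma , \\ N^{-1/2} V_{ij} & \text{otherwise.} \end{cases}$$

In particular, $H_0$ is a GOE/GUE matrix and $H_{\gamma_{\max}} = H$.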
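The interpolation between the two ensembles can be sketched in a few lines. The code below assumes one particular (row-by-row) choice of the ordering map, which the text leaves arbitrary, and represents the two families of entries as dictionaries keyed by $(i, j)$ with $i \leq j$; the names `ordering_map` and `interpolating_entry` are hypothetical.

```python
def ordering_map(n):
    """One possible bijection from {(i, j): 1 <= i <= j <= n} onto {1, ..., n(n+1)/2}.

    Assumption: entries are enumerated row by row; any other bijection works too.
    """
    phi = {}
    gamma = 0
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            gamma += 1
            phi[(i, j)] = gamma
    return phi

def interpolating_entry(i, j, gamma, phi, w, v):
    """Upper-triangular entry (before the N^{-1/2} scaling) of the gamma-th matrix:
    taken from w if phi(i, j) <= gamma, and from v otherwise."""
    return w[(i, j)] if phi[(i, j)] <= gamma else v[(i, j)]
```

With this convention, `gamma = 0` reproduces all entries of `v`, `gamma = n * (n + 1) // 2` reproduces all entries of `w`, and consecutive values of `gamma` differ in exactly one upper-triangular entry, which is what the telescoping argument relies on.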

Let $E^{(ij)}$ denote the matrix whose matrix elements are given by $E^{(ij)}_{kl} := \delta_{ik} \delta_{jl}$. Fix $\gamma \geq 1$ and let $(a, b)$
be determined by $\phi(a, b) = \gamma$. We shall compare $H_{\gamma-1}$ with $H_\gamma$ for each $\gamma$ and then sum up the differences.
Note that the matrices $H_{\gamma-1}$ and $H_\gamma$ differ only in the entries $(a, b)$ and $(b, a)$, and they can be written as

$$H_{\gamma-1} = Q + N^{-1/2} V \quad \text{where} \quad V := V_{ab} E^{(ab)} + \mathbf{1}(a \neq b) V_{ba} E^{(ba)} , \qquad (3.22)$$

and

$$H_\gamma = Q + N^{-1/2} W \quad \text{where} \quad W := W_{ab} E^{(ab)} + \mathbf{1}(a \neq b) W_{ba} E^{(ba)} ;$$

here the matrix $Q$ satisfies $Q_{ab} = Q_{ba} = 0$.
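Next, we introduce the Green functions

$$R := \frac{1}{Q - z} , \qquad S := \frac{1}{H_{\gamma-1} - z} , \qquad T := \frac{1}{H_\gamma - z} , \qquad (3.23)$$

which are well-defined for $\eta > 0$ since $Q$ and $H_\gamma$ are self-adjoint. Using the notation $G^\gamma := (H_\gamma - z)^{-1}$, we
have the telescopic sum

$$\mathbb{E} (\operatorname{Im} G^{\gamma_{\max}}_{vv})^n - \mathbb{E} (\operatorname{Im} G^0_{vv})^n = \sum_{\gamma=1}^{\gamma_{\max}} \Big( \mathbb{E} (\operatorname{Im} G^\gamma_{vv})^n - \mathbb{E} (\operatorname{Im} G^{\gamma-1}_{vv})^n \Big) . \qquad (3.24)$$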
For any $K \in \mathbb{N}$ we have the resolvent expansions

$$S = \sum_{k=0}^{K-1} N^{-k/2} (-RV)^k R + N^{-K/2} (-RV)^K S = \sum_{k=0}^{K-1} N^{-k/2} R (-VR)^k + N^{-K/2} S (-VR)^K \qquad (3.25)$$

and

$$R = \sum_{k=0}^{K-1} N^{-k/2} (SV)^k S + N^{-K/2} (SV)^K R = \sum_{k=0}^{K-1} N^{-k/2} S (VS)^k + N^{-K/2} R (VS)^K . \qquad (3.26)$$

Now we choose $K = 10$ in (3.26). Applying Theorem 3.6 to the resolvent $S$ of the Wigner matrix $H_{\gamma-1}$, using the rough bound
$\|R\| \leq \eta^{-1} \leq N$ to estimate the rest term in (3.26), and recalling (2.1), we find

$$|R_{ij} - \delta_{ij} m| \leq |S_{ij} - \delta_{ij} m| + \varphi^{C_\zeta} N^{-1/2} \leq \varphi^{C_\zeta} \Psi \qquad (3.27)$$

with $2\zeta$-high probability. Here we also used (3.5). Throughout the proof we shall tacitly make use of the
bound $|R_{ij}| \leq C$ with $2\zeta$-high probability, as follows from (3.27).
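The resolvent expansion (3.25) is an exact identity obtained by iterating $S = R - N^{-1/2} R V S$. A minimal numerical check on a $2 \times 2$ toy example is sketched below; the concrete matrices, the value of $N$, and the helper names are illustrative assumptions, chosen so that $Q_{ab} = Q_{ba} = 0$ and $V$ is supported on the entries $(a, b)$ and $(b, a)$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inv2(a):
    """Closed-form inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def add(a, b, cb=1.0):
    """Return a + cb * b for 2x2 matrices."""
    return [[a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def expansion_error(K, N=50.0, z=0.4 + 0.3j):
    """Largest entry of |S - (sum_{k<K} N^{-k/2} (-RV)^k R + N^{-K/2} (-RV)^K S)|."""
    q = [[0.7, 0.0], [0.0, -0.2]]                 # toy Q with Q_ab = Q_ba = 0
    v = [[0.0, 1.3 - 0.4j], [1.3 + 0.4j, 0.0]]    # toy V on entries (a, b) and (b, a)
    z_id = [[z, 0.0], [0.0, z]]
    r = inv2(add(q, z_id, -1.0))                          # R = (Q - z)^{-1}
    s = inv2(add(add(q, v, N ** -0.5), z_id, -1.0))       # S = (Q + N^{-1/2} V - z)^{-1}
    minus_rv = [[-x for x in row] for row in matmul(r, v)]
    power = [[1.0, 0.0], [0.0, 1.0]]                      # (-RV)^k, starting at k = 0
    total = [[0j, 0j], [0j, 0j]]
    for k in range(K):
        total = add(total, matmul(power, r), N ** (-k / 2))
        power = matmul(power, minus_rv)
    total = add(total, matmul(power, s), N ** (-K / 2))   # remainder term with S
    return max(abs(total[i][j] - s[i][j]) for i in range(2) for j in range(2))
```

The error is at rounding level for every $K$, since the remainder term carries the full resolvent $S$; the expansion becomes useful because each additional factor of $V$ costs $N^{-1/2}$.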

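Next, setting $K = 1$ in (3.25), recalling (2.1), and using Lemma 3.8, we find

$$|S_{va} - R_{va}| \leq N^{-1/2} \varphi^{C_\zeta} \big( |S_{va} R_{ba}| + |S_{vb} R_{aa}| \big) \leq N^{-1/2} \varphi^{C_\zeta} \Big( \sqrt{\frac{\operatorname{Im} S_{vv}}{N\eta}} + |v_a| + |v_b| \Big) \qquad (3.28)$$

with $2\zeta$-high probability. Now (3.28), (3.5), and Lemma 3.8 yield

$$|R_{va}| \leq \varphi^{C_\zeta} \sqrt{\frac{\operatorname{Im} S_{vv}}{N\eta}} + C |v_a| + \varphi^{C_\zeta} N^{-1/2} \leq \varphi^{C_\zeta} \sqrt{\frac{\operatorname{Im} S_{vv} + \Phi}{N\eta}} + C |v_a| \qquad (3.29)$$

with $2\zeta$-high probability. The same bound holds for $R_{av}$. Similarly, choosing $K = 1$ in (3.25) yields, using
(3.29), that

$$|S_{vv} - R_{vv}| \leq N^{-1/2} \varphi^{C_\zeta} \big( |S_{va} R_{bv}| + |S_{vb} R_{av}| \big) \leq N^{-1/2} \varphi^{C_\zeta} \Big( \frac{\operatorname{Im} S_{vv} + \Phi}{N\eta} + |v_a|^2 + |v_b|^2 \Big) \qquad (3.30)$$

with $2\zeta$-high probability.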

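After these preparations, we may start to estimate

$$(\operatorname{Im} S_{vv})^n - (\operatorname{Im} R_{vv})^n = \sum_{m=1}^n A_m (\operatorname{Im} R_{vv})^{n-m} ,$$

where we defined

$$A_m := \binom{n}{m} (\operatorname{Im} S_{vv} - \operatorname{Im} R_{vv})^m .$$

We choose $K = 4$ in (3.25) and introduce the notation $S - R = \sum_{k=1}^4 Y_k$, whereby $Y_k$ has $k$ factors $V$. We
write

$$A_m = \sum_{k=m}^{4m} A_{m,k} \quad \text{where} \quad A_{m,k} := \binom{n}{m} \sum_{k_1, \dots, k_m = 1}^4 \mathbf{1}(k_1 + \cdots + k_m = k) \prod_{i=1}^m \operatorname{Im} (Y_{k_i})_{vv} . \qquad (3.31)$$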
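Thus we have

$$\mathbb{E} (\operatorname{Im} S_{vv})^n - \mathbb{E} (\operatorname{Im} R_{vv})^n = A + \sum_{m=1}^n \sum_{k=\max\{4,m\}}^{4m} \mathbb{E} A_{m,k} (\operatorname{Im} R_{vv})^{n-m} , \qquad (3.32)$$

where $A$ depends on the randomness only through $Q$ and the first three moments of $V_{ab}$.
We shall prove that

$$\sum_{m=1}^n \sum_{k=4}^{4m} \mathbb{E} |A_{m,k}| (\operatorname{Im} R_{vv})^{n-m} \leq \frac{\mathcal{E}_{ab}}{\log N} \Big( \mathbb{E} (\operatorname{Im} S_{vv})^n + (\varphi^{C_\zeta} \Phi)^n \Big) , \qquad (3.33)$$

where we defined

$$\mathcal{E}_{ab} := \sum_{\sigma, \tau = 0}^2 N^{-2 + \sigma/2 + \tau/2} |v_a|^\sigma |v_b|^\tau . \qquad (3.34)$$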

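For future use, we note that the proof of (3.33) does not require the vanishing of the third moments of $H$
as in (2.8). Before proving (3.33), we show how it implies (3.20). Let us abbreviate $X_\gamma := \mathbb{E} (\operatorname{Im} G^\gamma_{vv})^n$ and
$\varepsilon_\gamma := (\log N)^{-1} \mathcal{E}_{\phi^{-1}(\gamma)}$. Note that, since $\operatorname{Im} G^\gamma_{vv} \geq 0$, we have $X_\gamma \geq 0$ for all $\gamma$. Repeating the derivation
of (3.32) for $T$ instead of $S$, using that the first three moments of $V_{ab}$ and $W_{ab}$ are the same, and using the
estimate (3.33) and its analogue with $S$ replaced by $T$, we find

$$X_\gamma - X_{\gamma-1} \leq \varepsilon_\gamma \big( X_\gamma + X_{\gamma-1} + (\varphi^{C_\zeta} \Phi)^n \big) .$$

Abbreviating $r_\gamma := (1 - \varepsilon_\gamma)^{-1} (1 + \varepsilon_\gamma) \geq 1$ we therefore find

$$X_\gamma \leq r_\gamma X_{\gamma-1} + r_\gamma \varepsilon_\gamma (\varphi^{C_\zeta} \Phi)^n .$$

Since (3.20) holds for GOE/GUE, we have the initial estimate $X_0 \leq (\varphi^{C_\zeta} \Phi)^n$. Iteration therefore yields

Next, we observe that $\sum_\gamma \varepsilon_\gamma \leq 1$. Since $0 \leq \varepsilon_\gamma \leq 1/2$, we find $\prod_\gamma r_\gamma \leq C$. This implies

$$\mathbb{E} (\operatorname{Im} G_{vv})^n = X_{\gamma_{\max}} \leq (\varphi^{C_\zeta} \Phi)^n ,$$

which is (3.20).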

What remains is to prove (3.33). Recall that in (3.31), $(Y_k)_{vv} = N^{-k/2} [(-RV)^k R]_{vv}$ if $k < 4$ and
$(Y_4)_{vv} = N^{-2} [(-RV)^4 S]_{vv}$. For each $Y_{k_i}$ in (3.31), we write out the matrix multiplication in terms of
matrix elements of $S$, $R$, and $V$. Then we multiply everything out. We classify the resulting terms using
two additional parameters $s, t \geq 0$. Here $s$ is the total number of matrix elements $R_{va}$, $R_{av}$, $S_{va}$, and $S_{av}$;
$t$ is defined similarly with $a$ replaced by $b$. If $a = b$, we use the symmetric convention $s = t$.

We have the conditions

$$s + t = 2m , \qquad k \geq \max\{s, t\} . \qquad (3.35)$$

The first one is immediate. The second one is clearly true if $a = b$. In order to prove it in the case $a \neq b$,
assume for definiteness that $s \geq t$. Then each factor $R_{va}$, $R_{av}$, $S_{va}$, and $S_{av}$ is associated with a unique
factor $V_{ab}$ or $V_{ba}$ (the one standing next to it in the matrix product); this proves the second condition of
(3.35). Thus we have the decomposition

$$A_{m,k} = \sum_{s,t=0}^k \mathbf{1}(s + t = 2m) A_{m,k,s,t} , \qquad (3.36)$$

in self-explanatory notation.

Using Lemma 3.8 and (3.29), we get the bound 

(\Rva\ + \Ra V \ + \S va \ + |^av|) S (|i?vh| + \Rbv\ + \S vb \ + |Sfcv|) 

(\ rn / \ m—s/2 

/ \ m—t/2 

+ L^^+^A (c\ Vb \r + (c\v a \y(c\v b \r 



)m 
(l + N s / 4 \v a \ s + iV*/ 4 |^|* + iV^+^I^H^I*) (3.37) 

with $2\zeta$-high probability, where in the second step we used Lemma 3.10 below and $s + t \leq \varphi^\zeta$, and in the
third step the inequality $x^{m-a} y^a \leq (x + y)^m$. Here $D > 0$ is some constant to be chosen later, and $C_{\zeta,D}$
denotes a constant depending on $\zeta$ and $D$. For the following it will be convenient to abbreviate

$$\mathcal{F}_{ab}(s, t) := 1 + N^{s/4} |v_a|^s + N^{t/4} |v_b|^t + N^{s/4 + t/4} |v_a|^s |v_b|^t .$$

Using (3.4), (3.5), and Lemma 3.10 below, we find that there is a constant $C_{\zeta,D}$ such that for $z \in S(C_{\zeta,D})$
we have

$$\big( |R_{va}| + |R_{av}| + |S_{va}| + |S_{av}| \big)^s \big( |R_{vb}| + |R_{bv}| + |S_{vb}| + |S_{bv}| \big)^t \leq \varphi^{-Dm} \Big( (\operatorname{Im} S_{vv})^m + (\varphi^{C_{\zeta,D}} \Phi)^m \Big) \mathcal{F}_{ab}(s, t) \qquad (3.38)$$

with $2\zeta$-high probability.

Next, we observe that (3.30) and (3.5) imply

$$\operatorname{Im} R_{vv} \leq (1 + \varphi^{C_\zeta} N^{-1/2}) \operatorname{Im} S_{vv} + \varphi^{C_\zeta} \Phi \qquad (3.39)$$

with $2\zeta$-high probability. Recall that, by definition, $A_{m,k,s,t}$ contains $k$ factors $V$, $s$ factors in the set
$\{R_{va}, R_{av}, S_{va}, S_{av}\}$, and $t$ factors in the set $\{R_{vb}, R_{bv}, S_{vb}, S_{bv}\}$. Therefore the definitions (3.31) and
(3.36), as well as the estimates (2.1), (3.38), and (3.39), yield

|A m , M , t |(Imi? vv )"- m 

< (Anr^N- k /^- Dm [(lmS vv ) m + (<p c ^<S>) m ) F ab (s,t) ((l + ^N~^) ImS vv + ^<&)"~ m 

< ^ c <- D )™N- k l 2 ({lmS vv ) n + (<p c ^$) n y ab (s,t) (3.40) 
with $2\zeta$-high probability, where we used that $k \leq 4m$, that $n \leq \varphi^\zeta$, $\binom{n}{m} \leq n^m$, and Lemma 3.10 below.
Denote by $\Xi$ the event on which the estimate (3.40) holds; thus, $\mathbb{P}(\Xi^c) \leq N^C \exp(-\varphi^{2\zeta})$. Using (2.1) and
the deterministic bound $\|R\| + \|S\| \leq N$, it is easy to see that on $\Xi^c$ we have the rough estimate

1 /2 

E|A m , M , t |(Imi? vv )"- m l(S c ) < N n (w.\A mM \ 2 ) V(E C )^ 2 (Np^ exp(-c^) < ^N^ 0n 
for all $n \leq \varphi^\zeta$ and $N$ large enough. Therefore choosing $D$ large enough we get from (3.40)

nA rnM \{\TnR vv ) n - m sc ^- m ((ImS vv )" + $)") iV^ 2 F ab (s,t) . 
Therefore (3.33) follows using (3.35) if we can prove that

$$N^{-\max\{4,s,t\}/2} \Big( 1 + N^{s/4} |v_a|^s + N^{t/4} |v_b|^t + N^{s/4 + t/4} |v_a|^s |v_b|^t \Big) \leq C \mathcal{E}_{ab} = C \sum_{\sigma,\tau=0}^2 N^{-2 + \sigma/2 + \tau/2} |v_a|^\sigma |v_b|^\tau \qquad (3.41)$$

for all $s, t$. We check that all terms on the left-hand side of (3.41) are bounded, for all $s, t \geq 0$, by the
right-hand side of (3.41). The first term is trivial: $N^{-\max\{4,s,t\}/2} \leq N^{-2}$. The second term is bounded by

$$N^{-\max\{4,s,t\}/2} N^{s/4} |v_a|^s \leq N^{-2} + N^{-2 + 1/4} |v_a| + N^{-1} |v_a|^2 .$$

The third term is bounded similarly. Finally, the last term is bounded by 

7v - m a x{ 4, s , t }/2 ArS /4+ t /4 |ua|s|ufc|t ^ E + ^-2+1/2^11^1 + jy-2+1+1/4 ( | ^ 1 2 ^ | + ^| ^2) + | 1 2 1 „ ft , 2 _ 

where £ denotes a quantity bounded by the three previous terms. This completes the proof of (3.41), and 
hence of (3.33). □ 

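What remains is to prove the following elementary result.

Lemma 3.10. For $x, y > 0$ and $m \in \mathbb{N}$ we have

$$(x + y)^m \leq C x^m + (my)^m .$$

Proof. By convexity of the function $x \mapsto x^m$ we have, for any $\lambda \in (0, 1)$,

$$(x + y)^m = \Big( (1 - \lambda) \frac{x}{1 - \lambda} + \lambda \frac{y}{\lambda} \Big)^m \leq (1 - \lambda) \Big( \frac{x}{1 - \lambda} \Big)^m + \lambda \Big( \frac{y}{\lambda} \Big)^m \leq \frac{x^m}{(1 - \lambda)^m} + \frac{y^m}{\lambda^m} .$$

Choosing $\lambda = 1/m$ yields the claim for $m \geq 2$, since $(1 - 1/m)^{-m} \leq 4$; the case $m = 1$ is trivial. □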
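Lemma 3.10 can be checked exactly on integer inputs. The sketch below assumes the explicit constant $C = 4$, which is what the choice $\lambda = 1/m$ gives through the bound $(1 - 1/m)^{-m} \leq 4$ for $m \geq 2$; the lemma itself only asserts the existence of some constant, and the function name is hypothetical.

```python
def lemma_3_10_gap(x, y, m, c=4):
    """Return c * x^m + (m * y)^m - (x + y)^m, which should be nonnegative.

    Assumption: c = 4 is used as an explicit admissible constant.
    """
    return c * x ** m + (m * y) ** m - (x + y) ** m
```

With integer arguments the computation is exact, so a nonnegative return value is a genuine verification of the inequality at that point rather than a floating-point approximation.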
3.4. Estimate of $G_{vv} - m$. We now conclude the proof of Proposition 3.1. By polarization and linearity, it
is enough to prove the following result.

Lemma 3.11. Let $\zeta > 0$ be fixed. Then there exists a constant $C_\zeta$ such that, for all $n \leq \varphi^\zeta$, all deterministic
and normalized $v \in \mathbb{C}^N$, and all $z \in S(C_\zeta)$, we have

$$\mathbb{E} |G_{vv}(z) - m(z)|^n \leq (\varphi^{C_\zeta} \Psi(z))^n . \qquad (3.42)$$

Proof. The proof is very similar to that of Lemma 3.9, whose notation we take over without further
comment. In order to avoid dealing with complex numbers, we estimate the real and imaginary parts of
$G_{vv} - m$ separately. We give the argument for the real part; the imaginary part is dealt with in the same
way. Throughout the following $n$ denotes an even number less than $\varphi^\zeta$.

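For the GOE/GUE matrix $H_0$ we get from Theorem 3.6, as in the proof of Lemma 3.9, that

$$\mathbb{E} (\operatorname{Re} G^0_{vv} - \operatorname{Re} m)^n \leq (\varphi^{C_\zeta} \Psi)^n .$$

In order to perform the comparison step, we write, similarly to (3.32),

$$\mathbb{E} (\operatorname{Re} S_{vv} - \operatorname{Re} m)^n - \mathbb{E} (\operatorname{Re} R_{vv} - \operatorname{Re} m)^n = B + \sum_{m=1}^n \sum_{k=\max\{4,m\}}^{4m} \mathbb{E} B_{m,k} (\operatorname{Re} R_{vv} - \operatorname{Re} m)^{n-m} , \qquad (3.43)$$

where $B$ depends on the randomness only through $Q$ and the first three moments of $V_{ab}$, and

$$B_{m,k} := \binom{n}{m} \sum_{k_1, \dots, k_m = 1}^4 \mathbf{1}(k_1 + \cdots + k_m = k) \prod_{i=1}^m \operatorname{Re} (Y_{k_i})_{vv} .$$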

Similarly to (3.33), we shall prove that 

n 4m 



B m k | Rc R V v — R© tn 



logiV 



(Re S vv - Rem)' 



Im S v - 

Ni] 



m— 1 k — 4 

Using Lemma 3.9, (3.4), and (3.5) we find that the right-hand side of (3.44) is bounded by 

Sab 



(3.44) 



logiV 



- E 



(ReS* vv - Rem)" + (tp c <^ n 



Therefore (3.43) and (3.44) yield (3.42), exactly as in the paragraph following (3.34). 

What remains therefore is to prove (3.44). Using (3.37), (3.5), and Lemma 3.10 we get, for arbitrary 
D > 0, 



(Ka| + |i2ov| + |Sva| + \Sa V \) S (\R vb \ + \Rbv\ + \S vb \ + fiby^ 



Im S v 

Nn 



(^■ D *) m LF o6 («,t) (3.45) 
with $2\zeta$-high probability. Therefore we get, similarly to (3.40),
|-B TOi fe iSit ||Reii vv — Rem | 

< V (C ( -D )mN - k/ 2^ ReSvv _ Rem ^ + (^^ 

with 2£-high probability, where we used (3.30), N^ 1 / 2 ^ and Lemma 3.10. Choosing D > large enough 
and recalling (3.41) yields (3.44). (We omit the details of the analysis on the low-probability event, which 
are similar to those following (3.40).) This concludes the proof of Lemma 3.11. □ 



4. Proof of Theorem 2.2, Case B 

In this section we prove Theorem 2.2 in the case B, i.e. we impose no condition on the third moments of the 
entries of H, and ^(z) satisfies (2.9). By Markov's inequality, it suffices to prove the following result. 

Proposition 4.1. Fix $\zeta > 0$. Then there are constants $C_0$ and $C_\zeta$, both depending on $\zeta$, such that the
following holds. Assume that $z \in S(C_\zeta)$ satisfies (2.9) with constant $C_0$. Then we have, for all $n \leq \varphi^\zeta$ and
all deterministic $v, w \in \mathbb{C}^N$, that

$$\mathbb{E} |G_{vw}(z) - \langle v, w \rangle m(z)|^n \leq \big( \varphi^{C_\zeta} \Psi(z) \|v\| \|w\| \big)^n . \qquad (4.1)$$

The rest of this section is devoted to the proof of Proposition 4.1. We take over the notation of Section 
3, which we use throughout this section without further comment. 

4.1. Estimate of $\operatorname{Im} G_{vv}$. In this section we derive an a priori bound on $\operatorname{Im} G_{vv}$ by proving the following
result.

Lemma 4.2. Fix $\zeta > 0$. Then there are large enough constants $C_0$ and $C_\zeta$, both depending on $\zeta$, such that
the following holds. Assume that $z \in S(C_\zeta)$ satisfies (2.9) with constant $C_0$. Then we have, for all $n \leq \varphi^\zeta$
and all deterministic and normalized $v \in \mathbb{C}^N$, that

$$\mathbb{E} (\operatorname{Im} G_{vv}(z))^n \leq (\varphi^{C_\zeta} \Phi(z))^n . \qquad (4.2)$$

The following (trivial) observation will be needed in the next section: The constant $C_0$ may be increased
at will without changing $C_\zeta$ in (4.2).

The main technical estimate behind the proof of Lemma 4.2 is the following lemma. Recall the setup 
(3.21) of the Green function comparison, and in particular the definitions (3.23). 

Lemma 4.3. Fix $\zeta > 0$. Then there are constants $C_0$ and $C_1$, both depending on $\zeta$, such that if (2.9) holds
with constant $C_0$ then we have the following. For any $a, b$ we have



n 4m 

Yl KA m . k (lmR vv y 

>n — 1 fc-max{3,m} 



1.1.1 (l .v ^ ■ ) . I I I f 

where 

£ ab ■= £ ab + S ab (\v a \ 2 + N-^) = ^ 2+CT/2+T/ Vl> 6 | T + MM 2 + ^ 3/2 )- 



CT.T=0 
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Moreover, if 



then we have the stronger bound 

n 4m 



Ci 



m—1 /c-max{3,m} 



c 



logiV 

Before proving Lemma 4.3, we use it to complete the proof of Lemma 4.2.

Proof of Lemma 4.2. Let $B \subset \{1, \dots, N\}^2$ denote the subset



B 



(a,b) : \v a \ + \v b \ > N~ 



-1/4 

V Nrj 



Since $\|v\| = 1$, the number of indices $a$ such that $|v_a| \geq \varepsilon$ is bounded by $\varepsilon^{-2}$. Therefore



Therefore we have 



E 



c 



logiV 



£ h + N- 3/2 — 
tab + Nn 



E 



c ~ c 



log TV 



(a,b)GB s 17 (a,b)<£B 

Now (4.2) follows from (4.3) and (4.5), by repeating the argument after (3.34). 

Before proving Lemma 4.3, we record the following lower bound on $\eta$.

Lemma 4.4. Let $\zeta > 0$. If (2.9) holds then

n > ^/ 3 AT- 5 / 6 . 

PROOF. The claim follows immediately from {Nn)^ 1 ^ * ^ ^-^^iV -1 / 6 . 



logiV 



(4.4) 



£ a6 (E(lmS vv )" + (<^ <£)"). (4.5) 



□ 



(4.6) 

□ 



Proof of Lemma 4.3. Note that the proof of (3.33) did not use the assumption (2.8). In particular, all 
statements in the proof of Lemma 3.9 after (3.35) remain true in the case B. By (3.33), it is enough to prove 

EA m , 3 (Imi? vv )"- m | sC l ^^ Qb + iV- 3 / 2 ^-^(E(lm5 vv )" + (^<i>) n ) (4.7) 



for m = 1,2, 3 as well as, assuming (4.4), 



EA m . 3 (lmR vv ) n - m \ sC -L-£ ab (E(lmS vv ) n + (<p c <$) n ) 



(4.8) 



for $m = 1, 2, 3$. In order to prove (4.7) and (4.8), we distinguish four cases depending on $m$ and whether
$a = b$. Recall from (3.35) that

$$s + t = 2m , \qquad s \leq 3 , \qquad t \leq 3 . \qquad (4.9)$$

Case (i): $a = b$ and $m \leq 3$. Similarly to (3.37), we find

{\Rva\ + \Rav\) < <p- Dm (lmS vv + <p C <°<I>) (l + N m / 2 \v a \ 2m ) 

with 2£-high probability, for any constant D > and z e S(C^ j £>). Therefore (3.39) yields 

\A m , 3 \(ImR vv ) n - m < ^- Dm 7V- 3 / 2 (lm5 vv + ^-$)"(l + iV m /> a | 2m ) 

^- 1 (lm5 vv + ^*) n (JV- 3 / 2 + |t; | 2 ) 

with 2^-high probability, where we used that 1 < m < 3. Therefore Lemma 3.10 yields 

E|A m;3 |(Imi? vv )"-™ sC C^- 1 (E(lm5 vv )" + (^$)")(7V- 3 / 2 + | Wa | 2 ), (4.10) 

which is (4.8). In particular, we have also proved (4.7). Here we omit the details of the estimate on the 
event of low probability, which are analogous to those following (3.40). 

Case (ii): $a \neq b$ and $m = 3$. By (4.9), we have $s = t = 3$. From (3.37) we get

(\ m / \ m—s/2 

(\ m-t/2 
^ J ^+^A (c\v b \r+(c\v a \nc\v b \r (4.H) 

with 2C-high probability. Together with (3.4) and (3.39), this yields 
(|i?va| + |i?av|) S (|i?v b | + |i?bv|) t (Inii? vv )"- m s? (im 5 VV + <p c <-° $)" 

(€l\ m + ( ^y /2 ( in D^-t/2 u ^ t + f ^Y^ar'/Vi- + r^*r'/ 2 -*/ 2 i«j«i«j*l (4.12) 



with $2\zeta$-high probability and for any $D > 0$. Choosing $D$ and $C_0$ in (2.9) large enough, we get from (2.1),
(4.6), Lemma 3.10, and $N^{-1/2} \lesssim \Phi$ that

|A 3 , 3 |(Imi? vv )"- 3 < ^N-^[(lmS vv r + (v> C <*) n ) (V 1 / 2 + N^ 2 \v b \ 2 + N^ 2 \v a \ 2 + N 3 / 2 \v a \ 2 \v b \ 2 ) 

with 2£-high probability. Now (4.8), and hence also (4.7), follows easily (we omit the details of the analysis 
on the low-probability event). 

Case (iii): $a \neq b$ and $m = 2$. Consider first the case $s = t = 2$. Then $A_{2,3,2,2}$ (see (3.36) and (3.31)) is a
finite sum of $O(1)$ terms of the form

$$X_1 := R_{va} h_{ab} R_{bv} \, R_{va} h_{ab} R_{ba} h_{ab} R_{bv} . \qquad (4.13)$$

(The other terms can be obtained from (4.13) by permutation of indices and complex conjugation of factors.)
We shall estimate the contribution of $X_1$; the other terms are dealt with in exactly the same way. Note
the presence of an off-diagonal resolvent matrix element $R_{ba}$, as required by the condition $s = t = 2$. From
(3.27) and (4.12) we get, with $m = s = t = 2$, that



iXxKlmi?™)™- 2 sC ip c < VN- 3 / 2 (lmS vv + v CcD <S>) n 



<£2 



with $2\zeta$-high probability. Note the factor $\Psi$ arising from the estimate of $R_{ba}$. Choosing $D$ and $C_0$ large
enough, and recalling (2.9), we find using Lemma 3.10 that



iXxKlmi?™)"- 2 sC ^^((ImSwr + ^^j^ 



with 2^-high probability. This yields (4.8) and hence also (4.7). 

Let us therefore consider the case $s = 3$ and $t = 1$. (The case $s = 1$ and $t = 3$ is estimated in the same
way.) Using the bounds $\Phi \geq (N\eta)^{-1}$ and $\Phi \gtrsim N^{-1/2}$, we find



1^2,3,3,1 1 (IrniT-w)"- 2 

< ^iV- 3 / 2 (lmS vv + ^$)" 



(4.14) 



C c \ 3/2 



Nr] 



(^$)- 3/2 kl 2 + K<f>)^kl 2 K| 



iv- 3 / 2 ^ + 7V- 3 / 2 w + Ar-> a | 2 + tv- 1 / 2 !^! 2 !^! 



(4.15) 



with $2\zeta$-high probability, for $D$ and $C_0$ large enough. This yields (4.7) in the case $s = 3$ and $t = 1$.

In order to prove the stronger bound (4.8) in the case $s = 3$ and $t = 1$, we note that (3.29), (3.4), (3.5),
and the assumption (4.4) yield



<C if 



C( J lmS YV +$ 
Nrj 



(4.16) 



The same bound holds for $R_{av}$, $R_{vb}$, and $R_{bv}$. Now $A_{2,3,3,1}$ is a finite sum of $O(1)$ terms of the form

$$X_2 := R_{va} h_{ab} R_{bv} \, R_{va} h_{ab} R_{bb} h_{ba} R_{av} .$$

(Again, the other terms can be obtained from $X_2$ by permutation of indices and complex conjugation of
factors.) We shall show that

|EX 2 (Imi? vv )"- 2 | < Cp- 1 £ o6 (E(ImS vv ) n + ( ¥ > c <$) n ). (4.17) 

We split Rbb = {Rbb — m) + m in the definition of X 2 - The first resulting term is estimated, using (3.27), by 

^ -qj N- 3 / 2 \R va R bv R va R av \(ImR vv ) n - 2 . 

The estimate of |X L |(ImS' V v)™~ 2 above may now be applied verbatim. What remains is the second term 
resulting from the above splitting of X 2 . Since \m\ < C and h ab is independent of R, we therefore have to 
show that 

CN- 3 / 2 \ER va R bv R va R av (lmR vv ) n - 2 \ ^ ( p- 1 £ ab (E(hnS^) n + (p c <*) n ) . (4.18) 
Using (3.7), we expand

$$R_{bv} = m \mathcal{R}_{bv} + \mathcal{R}'_{bv} , \qquad (4.19)$$

where we defined (see also (3.9))

$$\mathcal{R}_{bv} := -\sum_k^{(b)} h_{bk} R^{(b)}_{kv} , \qquad \mathcal{R}'_{bv} := v_b R_{bb} + (R_{bb} - m) \mathcal{R}_{bv} . \qquad (4.20)$$

Now we observe that, using the bound (3.27), we may repeat the proof of Lemma 3.8 to the letter to find
that its statement holds with $(G, \mathcal{G})$ replaced with $(R, \mathcal{R})$. Thus we find



Urn R v 



+ C\v b \ < <p c < 



/ImS'vv + * 



I Im S vv + $ 

mi 



(4.21) 



with 2(-high probability, where in the second step we used (3.39) and (4.4), and in the last step (3.5). Using 
(3.27), (4.4), and $ ^ (iVr?) -1 , we therefore find 



\K' bv \ < ^-^+ ¥ ,- D JV- 1 /*j(ImS ¥V + ^$) 1 / J 



(4.22) 



with 2C-high probability, for any D > 0. Therefore (3.39) and (4.16) yield 



CN' 3 / 2 ¥.R va n' bv R va R av {lmR vv ) n - 2 < N-*l 2 <p c <{Nri)- 



3/2 



with 2^-high probability. Using (2.9), (4.6), and Lemma 3.10, we find that the right-hand side is bounded 

by 

1 JV- 2 ((Im5 vv ) r ' + (^$)' 



with 2(- high probability. Combined with the usual estimate on the complementary low-probability event, 
this concludes the estimate of the 7\Lj v -term. What remains is to prove that 



CN -m 



ER X 



JL bw R wa R aw (Im i? vv )"- 2 1 ^ if^Eab (E(Im S vv ) n + (<p c < $)") , (4.23) 



The key observation behind the estimate of (4.23) is that $\mathbb{E}_b \mathcal{R}_{bv} = 0$, where $\mathbb{E}_b$ denotes partial expectation
with respect to the $b$-th column of $Q$. Thus we have



K 



bv ■ 



ER va K bv R va Ra V (Imi? vv )"- 2 = E\R va R va R av (Imi? vv )"- 2 - R^R^R^ (ImR^)"- 2 
In order to compare the quantities in the brackets, we use (3.6), (3.27), and (4.16) to get 

Ka = R^l + = R^l + 0(^R vb ) , (4.24) 

(4.25) 



R 



/?vv = rW + ^hl = flW + o^cc ImS - + <& ) 



Rbb 
with $2\zeta$-high probability. In particular, we get from (3.39) and (4.16) that



ImiZW < (l + ip -<)ImS vv + l p c <Z, \R%\ < Z ^^^ (4-26) 

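with $2\zeta$-high probability, for $z \in S(C_\zeta)$ with some large enough $C_\zeta$. A telescopic estimate of the form

$$\prod_{i=1}^k (x_i + y_i) - \prod_{i=1}^k x_i = \sum_{j=1}^k \Big( \prod_{i=1}^{j-1} x_i \Big) y_j \Big( \prod_{i=j+1}^k (x_i + y_i) \Big)$$

therefore gives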

therefore gives 



CN~ 3/2 



R va R va R av (Imi? vv )"- 2 - R^Ri^R^ (Imi?^) 

3/2 



IZbv 



< ^7V- 3 / 2 ]^ v | ( k y $ ) *(lmS vv + ^«I>) 



/lmS vv + $ / * 1 V _ c *„. 



1/2 



with $2\zeta$-high probability, where in the last step we used (4.21) and $n \leq \varphi^\zeta$. Now (4.23) follows easily for
large enough $C_0$ in (2.9), using (2.9) and (4.6). This concludes the proof of (4.18) and hence of (4.17).

Case (iv): $a \neq b$ and $m = 1$. Similarly to (4.15), one easily finds the weak bound (4.7). Let us therefore
assume (4.4) and prove (4.8). It suffices to prove that

$$N^{-3/2} \big| \mathbb{E} X_3 (\operatorname{Im} R_{vv})^{n-1} \big| \leq \varphi^{-1} N^{-2} \Big( \mathbb{E} (\operatorname{Im} S_{vv})^n + (\varphi^{C_\zeta} \Phi)^n \Big) , \qquad (4.27)$$

where $X_3$ stands for any of the following expressions:

$$R_{va} R_{ba} R_{ba} R_{bv} , \qquad R_{va} R_{bb} R_{ab} R_{av} , \qquad R_{va} R_{bb} R_{aa} R_{bv} .$$

Here we used that $h_{ab}$ and $h_{ba}$ are independent of $R$. (Up to an immaterial renaming of indices and
complex conjugation, all terms are covered by one of these three cases.) Applying the splittings
$R_{aa} = m + (R_{aa} - m)$ and $R_{bb} = m + (R_{bb} - m)$, we find that it suffices to prove (4.27) for $X_3$ being any of

RvaRbaRbaRbv, Rva(Rbb — TTl) R a bRav , Rva(Rbb — fn)(R aa — m)i?fc v , 

RvaRabRav , Rva{Rbb — m)R bv , R- va (R aa — m)R bv , 

RvaRbv • 

Next, applying the splitting (4.19) to the last line, we find that it suffices to prove (4.27) for X 3 being any of 

RvaRbaRbaRbv, Rva{Rbb — TTl) RabRav , Rva(Rbb ~ m)(R aa ~ m)Rb v , lZ' va lZ' bv , (4.28a) 

RvaRabRav, Rva(Rbb — TTl) Rbv , Rva(Raa ~ m) Rbv , T^-'va^-bv , 7Z va lZ bv , (4.28b) 

n va TZ b v ■ (4.28c) 
For $X_3$ in (4.28a), we find from (3.27), (4.16), and (4.22) that

|*s| *S ^(^ + ^ D iV- 1/2 )(lm5 vv + ^<i>) 



with $\zeta$-high probability, from which (4.27) easily follows using (2.9), (4.6), (3.39), and Lemma 3.10, having
chosen $D$ and $C_0$ in (2.9) large enough.

Let us now consider $X_3 = R_{va}R_{ab}R_{av}$. Using (3.7), we split, similarly to (4.19),

$$R_{ab} = m\,\mathcal{R}_{ab} + (R_{bb} - m)\,\mathcal{R}_{ab}, \qquad \mathcal{R}_{ab} := -\sum_k^{(b)} R^{(b)}_{ak} h_{kb}.$$



Using (3.12), (3.4), (3.6), and (3.27), we find 

(6) \ 1/2 



1 



1/2 



c, (T)" !</< » <4 - 29) 



with $\zeta$-high probability. For the second part of $X_3$ resulting from the splitting of $R_{ab}$, we therefore get the
estimate

\R va (R bb - m)K ab R av \ < — (lmS vv + ^$) < ^ N~^ 2 (im S vv + ip c ^) 
with $\zeta$-high probability. For the first part, we use $\mathbb{E}_b \mathcal{R}_{ab} = 0$ to write



ER va TZ ab R av (lm R vv ) 



n-l 



E 



'RyaRvvQmRw)"- 1 - R^R^ilmR^r- 1 



b ■ 



This may be estimated using a telescopic sum, exactly as (4.8); we omit the details. This completes the
proof of (4.27) in the case $X_3 = R_{va}R_{ab}R_{av}$. The second and third terms of (4.28b) are estimated similarly.
For the choice $X_3 = \mathcal{R}_{va}\mathcal{R}'_{bv}$, we use $\mathbb{E}_a \mathcal{R}_{va} = 0$ to write



EK va ll bv (lmRv 



where we defined 



(K v ) (a) ■= v b R^ + {R^-m)nt 



\n-l 



?(«) 



E 



ft' bv (Imi? v 



\n-l 



(iZ^dmR^r- 1 TZ va , 



(4.30) 



(a) 



,(») 



'Hw - 



(ab) 

hbkR 

k 



(ab) 
kv 



We find 



(ab) 

k 



(6) _ ^? (a6)^ 
kv ) 



/ImS vv + $ 



1/2 



(*>) 1 2 



< ^^=(lm^ vv + $) 1/2 (4.31) 
with $\zeta$-high probability, where we used (4.26), (2.1), (3.12), (3.6), and (4.24). Together with (3.6), (3.27),
(4.16), and (4.4), we therefore find

|(7C) (a) (M + \H bv \)\R$ -R bb \ + \R^ m |c^-^=(lmS vv + rf/* 

T 2 

< ^^==(lmS vv + $) 1/2 (4.32) 
V Nrf 

with $\zeta$-high probability. Recalling (4.21), (4.25), Lemma 3.10, and the usual rough estimate on the
complementary low-probability event, a telescopic estimate in (4.30) therefore gives

E^ va K v (Inii?w)- 1 < ^ C (^+(^)( E ( ImS -) n + (/ < *) B )- 
Now (4.27) follows. 

Now we prove (4.27) for $X_3$ as in (4.28c). We begin with a graded expansion of $R_{vv}$. Using (3.8) we find

p p p( a ) P?( a ) P P 

P - - E>( a ) i Av " n '' Y _ p(o6) , n-vb n bv , "va-^av 
JX V v ^vv ' t-> ^vv ' r„> 



^ vv Rif R, 



We deal with the last term by applying (3.6) twice, followed by 

1 1 RabRba 



R ™ R { al RaaRbbR { al ' 

itself an immediate consequence of (3.6). This gives the graded expansion 

Rvv ^v V ^ ~t~ ^vv ^vv ^vv ' 

where 

p (o) p (a) p(6) p 

p[a6] ._ p(ab) p[a] ^vb -"by p[b] ._ n " 

n vv ' "tv i "tv ' n( a ) ' vv • ((,) 

-f? bfc -Raa 

^[0] RvbRbaRav ^ Rva RabRbv RvaRav RabRba 

VV ~ RaaRbb RaaRbb R aa R bb R a b } 

Note that $R^{[T]}_{vv}$ is independent of the columns of $H$ indexed by $T$. Moreover, by (3.27), Lemma 3.2, (3.6),
(4.16), and (4.26), we have



with $\zeta$-high probability. Thus we write

$$\mathbb{E}\,\mathcal{R}_{va}\mathcal{R}_{bv}(\mathrm{Im}\, R_{vv})^{n-1} = \sum_A \mathbb{E}\,\mathcal{R}_{va}\mathcal{R}_{bv}\prod_{i=1}^{n-1}\mathrm{Im}\, R^{[A_i]}_{vv},$$ (4.34)

where $A = (A_i)_{i=1}^{n-1}$ and $A_i \in \{\emptyset, a, b, ab\}$ for $i = 1, \dots, n-1$. In order to keep track of the terms in the
summation over $A$, we introduce the counting functions

$$r_1(A) := \sum_{i=1}^{n-1}\big(\mathbf{1}(A_i = a) + \mathbf{1}(A_i = b)\big), \qquad r_2(A) := \sum_{i=1}^{n-1}\mathbf{1}(A_i = \emptyset).$$

We partition the sum in (4.34) as

$$\sum_A = \sum_A \mathbf{1}(r_2(A) = 0)\,\mathbf{1}(r_1(A) = 0) + \sum_A \mathbf{1}(r_2(A) = 0)\,\mathbf{1}(r_1(A) = 1) + \sum_A \mathbf{1}(r_2(A) \ge 1 \text{ or } r_1(A) \ge 2).$$ (4.35)
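
As a small combinatorial illustration of the partition (4.35), the sketch below enumerates all label tuples $A \in \{\emptyset, a, b, ab\}^{n-1}$ for a small $n$ and assigns each tuple to one of the three summands. It assumes, as the surrounding text indicates, that $r_1$ counts the entries equal to $a$ or $b$ and that $r_2$ counts the entries equal to $\emptyset$; the names `LABELS`, `classify` and `class_sizes` are hypothetical.

```python
from itertools import product

LABELS = ("empty", "a", "b", "ab")   # "empty" stands for the empty-set label

def classify(A):
    """Return 1, 2 or 3 according to which summand of (4.35) contains A."""
    r1 = sum(lab in ("a", "b") for lab in A)   # entries equal to a or b
    r2 = sum(lab == "empty" for lab in A)      # entries equal to the empty set
    if r2 == 0 and r1 == 0:
        return 1
    if r2 == 0 and r1 == 1:
        return 2
    return 3                                   # r2 >= 1 or r1 >= 2

def class_sizes(n):
    """Count the tuples A of length n - 1 falling in each summand."""
    sizes = {1: 0, 2: 0, 3: 0}
    for A in product(LABELS, repeat=n - 1):
        sizes[classify(A)] += 1
    return sizes

sizes = class_sizes(5)
print(sizes, sum(sizes.values()) == 4 ** 4)
```

Every tuple lands in exactly one class, and the first class contains only the tuple with all entries equal to $ab$, in agreement with the remark that follows.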

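Let us concentrate on the first summand; its condition is equivalent to $A_i = ab$ for all $i$. Using $\mathbb{E}_a \mathcal{R}_{va} = 0$
and $\mathbb{E}_b(\mathcal{R}_{bv} - \mathcal{R}^{(a)}_{bv}) = 0$ we get

$$\mathbb{E}\,\mathcal{R}_{va}\mathcal{R}_{bv}(\mathrm{Im}\, R^{[ab]}_{vv})^{n-1} = \mathbb{E}\,(\mathcal{R}_{va} - \mathcal{R}^{(b)}_{va})(\mathcal{R}_{bv} - \mathcal{R}^{(a)}_{bv})(\mathrm{Im}\, R^{[ab]}_{vv})^{n-1}.$$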

From (4.31), (4.33), (3.39), and Lemma 3.10 we therefore get 

lEftv^vtlmi^l)"- 1 ! <p c <^-(E(ImS vv ) n + (<p c <$) n ) <^ ^N' 1 / 2 (E(lm S vv ) n + ( V C <$) n ) 

for large enough $C_0$.

The second summand of (4.35) consists of n terms of the form 

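$$\mathbb{E}\,\mathcal{R}_{va}\mathcal{R}_{bv}(\mathrm{Im}\, R^{[a]}_{vv})(\mathrm{Im}\, R^{[ab]}_{vv})^{n-2} = \mathbb{E}\,\mathcal{R}_{va}(\mathcal{R}_{bv} - \mathcal{R}^{(a)}_{bv})(\mathrm{Im}\, R^{[a]}_{vv})(\mathrm{Im}\, R^{[ab]}_{vv})^{n-2}.$$
Recalling (4.33), we estimate this as above by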



^^^(E(Im5 vv )" + (^ c <$)") < ^- 1 iV- 1 / 2 (E(Im5 vv )" + (^$)") 



for large enough $C_0$.

What remains is to estimate the third summand in (4.35). From (4.33) and (4.31) we get 

71-1 

]Tl(r 2 (A) > 1 or n(A) > 2) \TZ va TZ bv \ J] (imi?^ 1 1 

A i=l 

( ^ )8 ' (A^)( Im ^ + ^)" < V - 1 JV- 1 / a ((ImS ¥V )" + (^4)") 



with $\zeta$-high probability. This completes the proof of (4.27) for $X_3 = \mathcal{R}_{va}\mathcal{R}_{bv}$. □
4.2. Estimate of $G_{vv} - m$. We now conclude the proof of Proposition 4.1. By polarization and linearity, it
is enough to prove the following result.

Lemma 4.5. Fix $\zeta > 0$. Then there are constants $C_0$ and $C_\zeta$, both depending on $\zeta$, such that the following
holds. Assume that $z \in \mathbf{S}(C_\zeta)$ satisfies (2.9) with constant $C_0$. Then we have, for all $n \le \varphi^\zeta$ and all
deterministic and normalized $v \in \mathbb{C}^N$, that

$$\mathbb{E}\,|G_{vv}(z) - m(z)|^n \le (\varphi^{C_\zeta}\Psi(z))^n.$$ (4.36)



Proof. As in the proof of Lemma 3.11, we focus on $\mathrm{Re}\, G_{vv} - \mathrm{Re}\, m$. Assume without loss of generality that
$n$ is even. We shall prove that



E 



(5 m , 3 (Rei? vv -Rem)" m ) | (s ab + N~ 3 / V 1 * 



E(Re S vv - Re m) n + {<p c <qy 



for $m = 1, 2, 3$ as well as, assuming



kl + M < at 1 / 4 / 7172 ^, 



that 



E 



(yB m ^ (Re i? vv — Re to) ' 



1 



log AT 



E (Re S* vv - Re to) " + (^ c < tf) n 



(4.37) 



(4.38) 



(4.39) 



for $m = 1, 2, 3$. Here $C_1$ is a large enough constant depending on $\zeta$.

Assuming that (4.37) and (4.39) have been proved, we get the claim (4.36) from (3.44) and Lemma 4.3
applied to $S$; the details are identical to those of the proof of Lemma 4.2 and the argument following (3.34).

The proof of (4.37) and (4.39) is similar to the proof of (4.7) and (4.8). The key input is the a priori
bound

$$\mathrm{Im}\, S_{vv} \le \varphi^{C_\zeta}\Phi$$ (4.40)

with $\zeta$-high probability, which follows from (4.2) and Markov's inequality. Throughout the proof, we shall
consistently (and without further mention) make use of the inequality



* m \RcR vv — Re to 



Re S vv — Re m I + 



which follows from the elementary inequality $x^m y^{n-m} \le x^n + y^n$ for $x, y \ge 0$, Lemma 3.10, and the estimate

$$|R_{vv} - S_{vv}| \le \varphi^{C_\zeta}\Psi$$

with $\zeta$-high probability (as follows from (3.30)). Moreover, as in (4.16), we find that (4.38) implies

\Rva\ < ¥> C ^- (4-41) 

The same bound holds for $R_{av}$, $R_{vb}$, and $R_{bv}$.

As in the proof of Lemma 4.3, we consider four cases.

Case (i): $a = b$ and $m \le 3$. This is easily dealt with using (3.45); we omit further details.
Case (ii): $a \neq b$ and $m = 3$. Recall that in this case we have $t = s = 3$. From (4.11) we get

3 / X 3 



(\R V a\ + \Ra V \) (\Rvb\ + \Rbv\) 

< If' 



ImS'vv i t2 



3/2 



\ 3/2 

Im^w , ,t,2 I 1 2 



+ +kl 2 + KI 2 + *- 3 KI 2 K' 2 



with $\zeta$-high probability. Therefore using (4.40), (3.4), and $\Psi \ge cN^{-1/2}$ we get
|S3,3||Rei?w - Rcm|"~ 3 < ^jy" 3 / 2 * 3 !* 3 + \v a \ 2 + \v b \ 2 + 7V 3/ > |> 6 | 2 



Re i?vv — Re to 



n— 3 



^- D ((Rei?w - Rcto)" + {^ D ^) n y ab 



with $\zeta$-high probability, where in the last step we used (2.9). Choosing $D$ large enough yields (4.39), and
hence also (4.37).

Case (iii): $a \neq b$ and $m = 2$. In the case $s = t = 2$, the estimate is similar to the estimate of $X_1$ in (4.13).
Using (4.40), (3.4), and $\Psi \ge cN^{-1/2}$ we get

\Xi\ < y c ^&^ 2 + \v a \ 2 + \v b \ 2 + N\v a \ 2 \v b \ 2 ) 

with $\zeta$-high probability, from which (4.39), and hence also (4.37), easily follows.

Next, consider the case s = 3 and t = 1. In order to prove (4.37), we estimate using (4.40) and (3.29), 
similarly to (4.15), 



|B2,3,3,ll «S V C <N- 3 / 2 (H> + \v a \) 3 (* + \v b \) 



TV-3/2^ + iV- 3 / 2 |^ fo | + N-^ 2 + N-^ 2 \v a \ 2 \v b \ 



with $\zeta$-high probability, from which (4.37) follows. Let us therefore prove (4.39), assuming (4.38). Using
(4.41) and (4.40), we find

\Rva\ + \Rav\ + \Rvb\ + \Rbv\ < ^ * (4-42) 



with $\zeta$-high probability. We need to prove that



AT-3/2 



^R-v aRbbRav Rav Rbv (Rc R V v — Re TO 



n-2 



< <fi 1 £ab 



E(RcS vv - Rcto)" + {<p c <*) n 



(4.43) 



As for (4.18), by splitting R bb = (R bb — to) + to and using (3.27), we find that it is enough to prove 



AT-3/2 



^RvaRav Rav Rbv (Re i? V v — Rs TO.) 



n-2 



f £ab 



E(RcS vv - Rcto)™ + (<p c <*) n 



As for (4.18), we use the splitting (4.19). Using (3.27), (4.40), and (4.6), we find that the bounds 



\K bv \ ^ <fi C <* «S ^N- 1 / 6 , \W bv \ < / 



+ 



* 2 ) < ip c <N- 1/3 



(4.44) 



(4.45) 






hold with $\zeta$-high probability. Thus we get (4.44) with $R_{bv}$ replaced with $\mathcal{R}'_{bv}$. The remaining term with
$\mathcal{R}_{bv}$ is estimated exactly as (4.23); we omit the details.

Case (iv): $a \neq b$ and $m = 1$. In order to prove (4.37), we use (4.40) to get

|£i, 3 | < / 7c *(* + M + M + *~>a|M+*~> a | 2 + *~ 1 M 2 ) 

with $\zeta$-high probability, from which (4.37) easily follows using $\Psi \ge cN^{-1/2}$.

As for (4.27), in order to prove (4.37) and (4.39) it suffices to prove the following claim. For $X_3$ being
any expression in (4.28a) - (4.28c), we have

iV- 3 / 2 |EX3(Rci? vv -Rcm)"~ 1 | < tp- 1 (s ab + N- 3 ' 2 <p c <*j (e(RcS' vv - Rem)™ + (p c <tf) n ) , (4.46) 

as well as, assuming (4.38), 

iV- 3 / 2 |EX 3 (Rei? vv -Rem)"- 1 | < y^ 1 ^ (e(Rc S vv - Rem)™ + (p c <tf) n ) . (4.47) 

Note that from (4.20) and (4.40) we get that 

\K' bv \ sC C\v b \+if c <^ 2 . (4.48) 

If X 3 is any expression in (4.28a), we get from Lemma 3.8, (3.27), (4.40), and (4.48) that 

\X 3 \ s$ ^ c ^ 2 (y + \v a \)(y + \v b \)+^(y 2 + \v a \)(y 2 + \v b \) 

with $\zeta$-high probability. Now (4.47), and in particular (4.46), follows easily (note that we did not assume
(4.38)).

Next, let $X_3$ be an expression in (4.28b). From Lemma 3.8, (3.27), (4.40), and (4.48) we get
\X 3 \ ^ ^*(* + \v a \) (* + | Wo | + \v b \) + <p°< (f 2 + \v a \) (* + \v b \) + <p c < (* + |« |) (* 2 + |« 6 |) 

with $\zeta$-high probability. Now (4.46) follows easily. Moreover, (4.47) under the assumption (4.38) follows
exactly as in the paragraphs of (4.29) and (4.30), using the bound $|(\mathcal{R}'_{bv})^{(a)} - \mathcal{R}'_{bv}| \le \varphi^{C_\zeta}\Psi^3$ with $\zeta$-high
probability, as follows from (4.32) and (4.40).

Finally, we consider the case (4.28c), i.e. $X_3 = \mathcal{R}_{va}\mathcal{R}_{bv}$. Under the assumption (4.38), we find from
(4.40), (4.33), and (4.31),

l^-ftbvl < /^ 2 , \R [ :l\ + \B$l\ < <p°<* 2 , |i?l i| < <p c <* 3 

with $\zeta$-high probability. Then the argument from the proof of Lemma 4.2 can be applied almost unchanged,
and we get (4.47) assuming (4.38). □

5. Proof of Theorems 2.3 and 2.5 

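By Lemma 3.2, if $\eta \le \kappa$ and $|E| \ge 2$ then the control parameter on the right-hand side of (2.10) can also be
expressed as

$$\sqrt{\frac{\mathrm{Im}\, m(z)}{N\eta}} \asymp N^{-1/2}\kappa^{-1/4},$$ (5.1)

where $\kappa \equiv \kappa_E$ was defined in (3.2).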
Proof of Theorem 2.3. By polarization and linearity, it is enough to prove that

$$|G_{vv}(z) - m(z)| \le \varphi^{C_\zeta}\sqrt{\frac{\mathrm{Im}\, m(z)}{N\eta}}$$ (5.2)

with $\zeta$-high probability, for all normalized $v$. Moreover, by symmetry it suffices to consider the case
$2 + \varphi^{C_1}N^{-2/3} \le E \le \Sigma$; in particular, $\kappa \ge \varphi^{C_1}N^{-2/3}$. Using Lemma 3.2 we find that Theorem 2.2 implies
(5.2) if $\eta \ge \eta_0$, where we defined

$$\eta_0 := N^{-1/2}\kappa^{1/4}.$$

Note that $\eta_0 \le \kappa$.

It remains therefore to establish (5.2) when $0 < \eta \le \eta_0$. Define

$$z := E + i\eta, \qquad z_0 := E + i\eta_0.$$

By (5.1) and (5.2) at $z_0$, it is enough to prove that

$$|m(z) - m(z_0)| \le CN^{-1/2}\kappa^{-1/4}$$ (5.3)

and

$$|G_{vv}(z) - G_{vv}(z_0)| \le \varphi^{C_\zeta}N^{-1/2}\kappa^{-1/4}$$ (5.4)

with $\zeta$-high probability.

Differentiating (2.5), we find

$$m' = \frac{m^2}{1 - m^2},$$ (5.5)

which, by Lemma 3.2, implies that $m' \asymp (\kappa + \eta)^{-1/2} = O(\kappa^{-1/2})$. Therefore we get

$$|m(z) - m(z_0)| \le C\kappa^{-1/2}\eta_0 = CN^{-1/2}\kappa^{-1/4},$$

which is (5.3).
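
The derivative formula (5.5) can be checked numerically. The sketch below assumes that (2.5) is the usual self-consistent equation $m + 1/m + z = 0$ for the Stieltjes transform of the semicircle law, with the branch $\mathrm{Im}\, m > 0$ for $\mathrm{Im}\, z > 0$; this form of (2.5), the helper names `m_sc` and `m_prime`, and the test point `z` are assumptions made for illustration.

```python
import numpy as np

def m_sc(z):
    """Root of m**2 + z*m + 1 = 0 with positive imaginary part (for Im z > 0)."""
    z = complex(z)
    s = np.sqrt(z * z - 4.0)
    r1, r2 = (-z + s) / 2.0, (-z - s) / 2.0
    return r1 if r1.imag > 0 else r2

def m_prime(z):
    """Right-hand side of (5.5): m^2 / (1 - m^2)."""
    m = m_sc(z)
    return m * m / (1.0 - m * m)

z = 2.3 + 0.05j
m = m_sc(z)
residual = abs(m + 1.0 / m + z)            # checks the assumed equation (2.5)

h = 1e-6
finite_diff = (m_sc(z + h) - m_sc(z - h)) / (2.0 * h)   # central difference
error = abs(finite_diff - m_prime(z))
print(residual, error)
```

Both printed numbers should be small, which supports the sign convention used in (5.5).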

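Next, by Theorem 3.7 we have $E \ge \lambda_N + \eta_0$ with $\zeta$-high probability provided $C_1$ is large enough.
Therefore, since $\eta \le \eta_0 \le E - \lambda_N \le E - \lambda_\alpha$ with $\zeta$-high probability for all $\alpha \le N$, we get

$$\mathrm{Im}\, G_{vv}(z) = \sum_\alpha \frac{\eta\,|\langle u^{(\alpha)}, v\rangle|^2}{(E - \lambda_\alpha)^2 + \eta^2} \le \sum_\alpha \frac{2\eta_0\,|\langle u^{(\alpha)}, v\rangle|^2}{(E - \lambda_\alpha)^2 + \eta_0^2} = 2\,\mathrm{Im}\, G_{vv}(z_0) \le \varphi^{C_\zeta}N^{-1/2}\kappa^{-1/4}$$ (5.6)

with $\zeta$-high probability, by (5.2) at $z_0$ and the estimate $\mathrm{Im}\, m(z_0) \le CN^{-1/2}\kappa^{-1/4}$. Finally, we estimate the
real part from

$$|\mathrm{Re}\, G_{vv}(z) - \mathrm{Re}\, G_{vv}(z_0)| = \bigg|\sum_\alpha \frac{(E - \lambda_\alpha)(\eta_0^2 - \eta^2)\,|\langle u^{(\alpha)}, v\rangle|^2}{((E - \lambda_\alpha)^2 + \eta^2)((E - \lambda_\alpha)^2 + \eta_0^2)}\bigg| \le \sum_\alpha \frac{\eta_0}{E - \lambda_\alpha}\,\frac{\eta_0\,|\langle u^{(\alpha)}, v\rangle|^2}{(E - \lambda_\alpha)^2 + \eta_0^2} \le \mathrm{Im}\, G_{vv}(z_0)$$ (5.7)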

with $\zeta$-high probability, where in the last step we used that $\eta_0 \le E - \lambda_N$. Combining (5.6) and (5.7)
completes the proof of (5.4). □
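
The two deterministic comparisons used at the end of this proof, namely $\mathrm{Im}\, G_{vv}(E + i\eta) \le 2\,\mathrm{Im}\, G_{vv}(E + i\eta_0)$ and $|\mathrm{Re}\, G_{vv}(E + i\eta) - \mathrm{Re}\, G_{vv}(E + i\eta_0)| \le \mathrm{Im}\, G_{vv}(E + i\eta_0)$ for $\eta \le \eta_0 \le E - \lambda_N$, depend only on the spectral representation of $G_{vv}$. A minimal sketch with synthetic data follows; the eigenvalues `lam` and weights `w` are arbitrary stand-ins and do not come from a Wigner matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200
lam = np.sort(rng.uniform(-2.0, 2.0, size=N))   # stand-in eigenvalues
w = rng.dirichlet(np.ones(N))                   # stand-in weights |<u_alpha, v>|^2, summing to 1

eta0 = 0.05
E = lam[-1] + eta0 + 0.01                       # ensures E - lam_alpha >= eta0 for every alpha

def G_vv(z):
    """Spectral representation: sum_alpha w_alpha / (lam_alpha - z)."""
    return np.sum(w / (lam - z))

G0 = G_vv(E + 1j * eta0)
checks = []
for eta in (eta0, 0.5 * eta0, 1e-2 * eta0, 1e-6 * eta0):
    Gz = G_vv(E + 1j * eta)
    imag_ok = Gz.imag <= 2.0 * G0.imag + 1e-12            # comparison behind (5.6)
    real_ok = abs(Gz.real - G0.real) <= G0.imag + 1e-12   # comparison behind (5.7)
    checks.append((eta, bool(imag_ok), bool(real_ok)))
    print(eta, imag_ok, real_ok)
```

The point of the check is that no probabilistic input is needed for this step: once $E$ is at distance at least $\eta_0$ from the spectrum, lowering $\eta$ below $\eta_0$ cannot make $G_{vv}$ much larger.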
Proof of Theorem 2.5. We begin with (2.14), whose proof is immediate. Using Theorem 2.2 with
Condition A and Remark 2.4, we find

$$C \ge \mathrm{Im}\, G_{vv}(\lambda_\alpha + i\eta) = \sum_\beta \frac{\eta\,|\langle u^{(\beta)}, v\rangle|^2}{(\lambda_\beta - \lambda_\alpha)^2 + \eta^2} \ge \eta^{-1}|\langle u^{(\alpha)}, v\rangle|^2$$

with $\zeta$-high probability, where we used Theorem 3.7 to ensure that $\lambda_\alpha \in [-\Sigma, \Sigma]$ with $\zeta$-high probability.
Choosing $\eta = \varphi^{C_\zeta}N^{-1}$ yields (2.14).
In order to prove (2.13), we set

$$\eta := \gamma_b - \gamma_a, \qquad E := \gamma_a,$$

where $\gamma_\alpha$ is the classical location of the $\alpha$-th eigenvalue defined in (3.17). Then we get

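$$\sum_{\alpha=a}^{b} |\langle u^{(\alpha)}, v\rangle|^2 \le \varphi^{C_\zeta}\sum_{\alpha=a}^{b}\frac{\eta^2\,|\langle u^{(\alpha)}, v\rangle|^2}{(\lambda_\alpha - E)^2 + \eta^2} \le \varphi^{C_\zeta}\,\eta\,\mathrm{Im}\, G_{vv}(E + i\eta),$$ (5.8)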

where in the first step we used Theorem 3.7 to conclude that $(\lambda_\alpha - E)^2 \le \varphi^{C_\zeta}\eta^2$ for $a \le \alpha \le b$. In order
to invoke Theorem 2.2 with Condition B, we have to satisfy (2.9). Recalling Lemma 3.2, we find that (2.9)
holds provided that

$$\eta \ge \varphi^{C_0}N^{-5/6}, \qquad \kappa \le \varphi^{-2C_0}\eta^2 N^{4/3},$$ (5.9)
where we abbreviated $\kappa \equiv \kappa_E$. From (3.17) we get

$$\gamma_\alpha + 2 \asymp \alpha^{2/3}N^{-2/3}$$ (5.10)

for $\alpha \le N/2$, from which we deduce, recalling $E = \gamma_a$,

$$\kappa \asymp a^{2/3}N^{-2/3}, \qquad \eta \asymp (b^{2/3} - a^{2/3})N^{-2/3}.$$
Hence (5.9) is satisfied provided that

$$b^{2/3} - a^{2/3} \ge \varphi^{C_0}N^{-1/6} + \varphi^{C_0}a^{1/3}N^{-1/3}.$$

Since $b^{2/3} - a^{2/3} \ge b^{-1/3}(b - a)/2$, we find that (5.9), and hence (2.9), holds under the condition (2.12).
Therefore we may apply Theorem 2.2 to the right-hand side of (5.8) to get

^|(u( Q ),v)| 2 sC ^^ + Im m (£ + i^ < ^7V- 1 ((6 2 / 3 -a 2 / 3 ) 3 / 2 + a 1 / 3 (6 2 / 3 -a 2 / 3 )) 

with $\zeta$-high probability, where we used Lemma 3.2. The claim now follows from the elementary inequalities

$$b^{2/3} - a^{2/3} \le (b - a)^{2/3}, \qquad b^{2/3} - a^{2/3} \le a^{-1/3}(b - a).$$ □
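
The elementary inequalities for $b^{2/3} - a^{2/3}$ used in this proof (the two upper bounds above and the lower bound $b^{2/3} - a^{2/3} \ge b^{-1/3}(b-a)/2$ used earlier) can be spot-checked over a grid of integer indices. The sketch below is only such a spot check under the assumption $1 \le a < b$; the range of indices is arbitrary.

```python
def check(a, b):
    """Return the truth values of the three inequalities for 1 <= a < b."""
    diff = b ** (2 / 3) - a ** (2 / 3)
    return (
        diff <= (b - a) ** (2 / 3),          # concavity (subadditivity) bound
        diff <= a ** (-1 / 3) * (b - a),     # mean value bound from above
        diff >= b ** (-1 / 3) * (b - a) / 2, # mean value bound from below
    )

violations = [
    (a, b)
    for a in range(1, 150)
    for b in range(a + 1, 151)
    if not all(check(a, b))
]
print(len(violations))
```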

For future use, we record the following consequence of Theorem 2.5 which is useful in combination with
dyadic decompositions. For any integer $K \le N/4$ we have

$$\sum_{\alpha=K}^{2K} |\langle u^{(\alpha)}, v\rangle|^2 \le \varphi^{C_\zeta}KN^{-1}$$ (5.11)

with $\zeta$-high probability.
6. Eigenvalue locations: proof of Theorem 2.7



6.1. Basic facts from linear algebra. We begin by collecting a few well-known tools from linear algebra, on 
which our analysis of the deformed spectrum relies. 

We use the following representation of the eigenvalues of $\widetilde{H}$, which was already used in several papers on
finite-rank deformations of random matrices [5-7, 32].

Lemma 6.1. If $\mu \in \mathbb{R} \setminus \sigma(H)$ and $\det(D) \neq 0$ then $\mu \in \sigma(\widetilde{H})$ if and only if

$$\det\big(V^* G(\mu) V + D^{-1}\big) = 0.$$

Proof. For the convenience of the reader, we give the simple proof. The claim follows from the computation

$$\det(\widetilde{H} - \mu) = \det(H - \mu)\det\big(1 + (H - \mu)^{-1}VDV^*\big)$$
$$= \det(H - \mu)\det\big(1 + V^*(H - \mu)^{-1}VD\big)$$
$$= \det(H - \mu)\det(D)\det\big(D^{-1} + V^*(H - \mu)^{-1}V\big),$$

where in the second step we used the identity $\det(1 + AB) = \det(1 + BA)$ which is valid for any $n \times m$
matrix $A$ and $m \times n$ matrix $B$. □
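
As an illustration of Lemma 6.1, the following sketch checks the determinant identity at a generic complex spectral parameter and verifies that $V^*G(\mu)V + D^{-1}$ is numerically singular at each eigenvalue of the deformed matrix. It assumes $\widetilde{H} = H + VDV^*$ with $V$ having orthonormal columns; the small real symmetric matrix, the rank $k = 2$ and the entries of `D` are arbitrary test data, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
N, k = 8, 2

X = rng.normal(size=(N, N))
H = (X + X.T) / np.sqrt(2 * N)                  # real symmetric stand-in for H
V = np.linalg.qr(rng.normal(size=(N, k)))[0]    # N x k matrix with orthonormal columns
D = np.diag([1.7, -2.4])                        # invertible k x k diagonal matrix
H_def = H + V @ D @ V.T                         # finite-rank deformation

def M(mu):
    """The k x k matrix V^* G(mu) V + D^{-1}, with G(mu) = (H - mu)^{-1}."""
    G = np.linalg.inv(H - mu * np.eye(N))
    return V.T @ G @ V + np.linalg.inv(D)

# (a) determinant identity from the proof, at a generic complex mu
mu = 0.3 + 0.2j
lhs = np.linalg.det(H_def - mu * np.eye(N))
rhs = np.linalg.det(H - mu * np.eye(N)) * np.linalg.det(D) * np.linalg.det(M(mu))
rel_err = abs(lhs - rhs) / abs(lhs)

# (b) at each eigenvalue of the deformed matrix, M is numerically singular
ratios = []
for mu_i in np.linalg.eigvalsh(H_def):
    s = np.linalg.svd(M(mu_i), compute_uv=False)
    ratios.append(s[-1] / s[0])                 # smallest over largest singular value

print(rel_err, max(ratios))
```

The practical content of the lemma is the reduction from an $N \times N$ eigenvalue problem to the singularity of a $k \times k$ matrix, which is what makes the isotropic control of $V^*GV$ sufficient.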

We shall also make use of the well-known Weyl's interlacing property, summarized in the following lemma. 

Lemma 6.2. If $A$ is an $N \times N$ Hermitian matrix and $B = A + dvv^*$ with some $d \ge 0$ and $v \in \mathbb{C}^N$, then the
eigenvalues of $A$ and $B$ are interlaced:

$$\lambda_1(A) \le \lambda_1(B) \le \lambda_2(A) \le \cdots \le \lambda_{N-1}(B) \le \lambda_N(A) \le \lambda_N(B).$$
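
The interlacing property of Lemma 6.2 is easy to observe numerically. The sketch below uses an arbitrary random Hermitian matrix and a nonnegative rank-one perturbation; the size `N`, the value of `d` and the tolerance are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 12
X = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
A = (X + X.conj().T) / 2                     # random Hermitian matrix
v = rng.normal(size=N) + 1j * rng.normal(size=N)
d = 0.8                                      # nonnegative coupling
B = A + d * np.outer(v, v.conj())            # rank-one perturbation d v v^*

lam_A = np.linalg.eigvalsh(A)                # ascending order
lam_B = np.linalg.eigvalsh(B)

tol = 1e-10
lower_ok = bool(np.all(lam_A <= lam_B + tol))            # lambda_i(A) <= lambda_i(B)
upper_ok = bool(np.all(lam_B[:-1] <= lam_A[1:] + tol))   # lambda_i(B) <= lambda_{i+1}(A)
print(lower_ok, upper_ok)
```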

We shall occasionally need the eigenvalues of $H$ to be distinct. To that end, we assume without loss of
generality that the law of $H$ is absolutely continuous; otherwise consider the matrix $H + e^{-N}V$ where $V$ is
a GOE/GUE matrix independent of $H$. It is immediate that this perturbation does not change any of $H$'s
spectral statistics. Moreover, any Hermitian matrix with an absolutely continuous law has almost surely
distinct eigenvalues.

6.2. Warmup: the rank-one case. In order to illustrate our method, we first present a much simplified proof
which deals with the case $k = 1$. Let $v \in \mathbb{C}^N$ be normalized and deterministic, and $d \in \mathbb{R}$ be deterministic
(and possibly $N$-dependent). Define the deformed matrix

$$\widetilde{H} := H + dvv^*.$$

For the following we note the elementary estimate

$$\theta(d) - 2 \asymp (d - 1)^2,$$ (6.1)

as follows from (2.18).

Theorem 6.3. Fix $\zeta > 0$. Then there is a constant $C_\zeta$ such that the following holds. For $0 \le d \le 1$ we have

$$0 \le \mu_N - \lambda_N \le \frac{\varphi^{C_\zeta}}{N(1 - d + N^{-1/3})}$$

with $\zeta$-high probability. For $1 \le d \le \Sigma - 1$ we have

$$|\mu_N - \theta(d)| \le \varphi^{C_\zeta}N^{-1/2}\sqrt{d - 1 + N^{-1/3}}$$

with $\zeta$-high probability.

By symmetry, an analogous result holds for $d \le 0$.

Proof. First we note that it is enough to consider $d \in \mathbb{R}_+ \setminus [1 - \varphi^D N^{-1/3}, 1 + \varphi^D N^{-1/3}]$ for some arbitrary
but fixed $D > 0$. This follows from $|\lambda_N - 2| \le \varphi^{C_\zeta}N^{-2/3}$ with $\zeta$-high probability (see Theorem 3.7), the
monotonicity of the map $d \mapsto \lambda_N(H + dvv^*)$ (see Lemma 6.2), and the observation that $\theta(1 + \varepsilon) =
2 + \varepsilon^2 + O(\varepsilon^3)$ as $\varepsilon \to 0$ (which implies that $|\theta(d) - 2| \le \varphi^{2D+1}N^{-2/3}$ for $d \in [1 - \varphi^D N^{-1/3}, 1 + \varphi^D N^{-1/3}]$).
The key identity${}^3$ for the proof is

$$G_{vv}(\mu_N) = -\frac{1}{d},$$

as follows from Lemma 6.1. Let us begin with the case $d \ge 1 + \varphi^D N^{-1/3}$. Since $m : \mathbb{R} \setminus (-2, 2) \to [-1, 1] \setminus \{0\}$
is bijective, we find from (2.5) that $\theta(d)$ is uniquely characterized by

$$m(\theta(d)) = -\frac{1}{d}.$$ (6.2)
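
To make the roles of (6.2) and of the key identity $G_{vv}(\mu_N) = -1/d$ concrete, here is a small simulation for a rank-one deformation of a GOE-type matrix. It assumes that (2.18) defines $\theta(d) = d + 1/d$, which is consistent with the expansion $\theta(1 + \varepsilon) = 2 + \varepsilon^2 + O(\varepsilon^3)$ quoted above but is not restated in this section; the matrix size, the seed and the choice $v = e_1$ are illustrative.

```python
import numpy as np

def m_sc(x):
    """Stieltjes transform of the semicircle law at real x > 2 (root with |m| < 1)."""
    return (-x + np.sqrt(x * x - 4.0)) / 2.0

def theta(d):
    """Assumed form of theta from (2.18): theta(d) = d + 1/d."""
    return d + 1.0 / d

d = 2.0
N = 400
rng = np.random.default_rng(4)
X = rng.normal(size=(N, N))
H = (X + X.T) / np.sqrt(2 * N)          # GOE-type matrix, spectrum close to [-2, 2]
v = np.zeros(N)
v[0] = 1.0                              # deterministic unit vector
H_def = H + d * np.outer(v, v)          # rank-one deformation

mu_N = np.linalg.eigvalsh(H_def)[-1]    # largest eigenvalue of the deformed matrix
G_vv = v @ np.linalg.solve(H - mu_N * np.eye(N), v)   # v^* (H - mu_N)^{-1} v

print(m_sc(theta(d)), -1.0 / d)         # deterministic relation (6.2)
print(G_vv, -1.0 / d)                   # key identity, exact up to rounding
print(mu_N, theta(d))                   # outlier location, close for large N
```

The first two comparisons are exact identities, while the third holds only up to a random fluctuation that shrinks as $N$ grows.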

We therefore have to solve the equation $m(\theta(d)) = G_{vv}(x)$ for $x \in [2 + \varphi^{C_1}N^{-2/3}, \infty)$, where $C_1$ is the constant
from Theorem 2.3. By Theorem 2.3, we have

$$G_{vv}(x) = m(x) + O\big(\varphi^{C_\zeta}N^{-1/2}\kappa^{-1/4}\big)$$ (6.3)

with $\zeta$-high probability.
Next, define the interval

$$I_d := [x_-(d), x_+(d)], \qquad x_\pm(d) := \theta(d) \pm \varphi^D N^{-1/2}(d - 1)^{1/2}.$$

We claim that

$$\kappa_x \asymp (d - 1)^2, \qquad m'(x) \asymp (d - 1)^{-1} \qquad (x \in I_d).$$ (6.4)
The first relation of (6.4) follows from

$$|x - \theta(d)| \le \varphi^D N^{-1/2}(d - 1)^{1/2} \quad \text{and} \quad \theta(d) - 2 \ge c(d - 1)^2 \ge c\,\varphi^{3D/2}N^{-1/2}(d - 1)^{1/2},$$

where in the last step we used $d \ge 1 + \varphi^D N^{-1/3}$. In order to prove the second relation of (6.4), we differentiate
(5.5) and use Lemma 3.2 to get

$$m'(x) \asymp \kappa^{-1/2}, \qquad m''(x) \asymp \kappa^{-3/2}.$$ (6.5)
Therefore we get from (6.5) and the mean value theorem applied to $m'$ that

$$|m'(x) - m'(\theta(d))| \le C\varphi^D N^{-1/2}(d - 1)^{1/2}(d - 1)^{-3} \le C'\varphi^{-D/2}(d - 1)^{-1}.$$



3 Here we ignore the possibility that $\mu_N \in \sigma(H)$. Since the law of $H$ is absolutely continuous, it is easy to check that the
interlacing inequalities in Lemma 6.2 are strict with probability one; see e.g. the proof of Lemma 6.7.
Therefore (6.4) follows from $m'(\theta(d)) \asymp (d - 1)^{-1}$.

Now choose $D$ large enough that $x_-(d) \ge 2 + \varphi^{C_1}N^{-2/3}$ for $d \ge 1 + \varphi^D N^{-1/3}$. Thus (6.3) and (6.4) yield



$$G_{vv}(x_-(d)) < m(\theta(d)) < G_{vv}(x_+(d))$$ (6.6)

with $\zeta$-high probability, provided $D$ is chosen larger than the constant $C_\zeta$ in (6.3). Finally we observe
that, by Theorem 3.7, with $\zeta$-high probability the function $x \mapsto G_{vv}(x)$ is continuous and increasing on
$[2 + \varphi^{C_1}N^{-2/3}, \infty)$. It follows that with $\zeta$-high probability the equation $G_{vv}(x) = m(\theta(d))$ has precisely one
solution, $x = \mu_N$, in $[2 + \varphi^{C_1}N^{-2/3}, \infty)$. Moreover, this solution lies in $I_d$, which implies that it satisfies the
claim of Theorem 6.3 for $d > 1$.

What remains is the case $d \le 1 - \varphi^D N^{-1/3}$. Choose $x := 2 + \varphi^{C_1}N^{-2/3}$ where $C_1$ is a large constant to
be chosen later. For large enough $C_1$ we find from Theorem 2.3



$$G_{vv}(x) = m(x) + O\big(N^{-1/3}\varphi^{-C_1/4}\big)$$
with $\zeta$-high probability. From (3.3) we find

$$1 + m(x) \asymp N^{-1/3}\varphi^{C_1/2},$$

which yields



(6.7) 



(6.8) 



$$1 + G_{vv}(x) \ge 0 \ge 1 - \frac{1}{d}$$



with $\zeta$-high probability. Choosing $C_1$ large enough, we find as above that $y \mapsto G_{vv}(y)$ is with $\zeta$-high
probability increasing and continuous for $y \ge x$, from which we deduce that

$$\lambda_N \le \mu_N \le x$$

with $\zeta$-high probability. (The first inequality follows from Lemma 6.2.)

Next, abbreviate $q := \varphi^{C_2}$ for some large constant $C_2$ to be chosen later. Using Theorem 3.7 we estimate,
for $\lambda_N \le \mu_N \le x$ and large enough $C_2$,



E 

a^N-q 



(a) 



E 



A 



(a) 



X 



l(u^,v)| 2 
(A Q - hn) 2 

2 k N- 1 

j (22fe/3jV-2/3)2 



+ C (N -2/3 



with $\zeta$-high probability. In the second inequality we estimated the contribution of the eigenvalues $\alpha \ge N/2$
using the dyadic decomposition

$$U_k := \{\alpha \in [N/2, N - q] : N - 2^{k+1} \le \alpha \le N - 2^k\}$$

combined with Theorem 3.7, the estimate

$$2 - \gamma_\alpha \asymp (N - \alpha)^{2/3}N^{-2/3} \qquad (\alpha \ge N/2),$$
and the delocalization estimate (5.11). A similar (in fact easier) dyadic decomposition works for the remaining
eigenvalues $\alpha < N/2$ and yields the last term of the second line. Moreover, we have

y l(u(a) ' v)|2 < ^c + c^-v3 

a>N—q 

with $\zeta$-high probability, by Theorems 3.7 and 2.5. Recalling (6.7) and (6.8), we have therefore proved that
with $\zeta$-high probability. Therefore

with $\zeta$-high probability. Theorem 2.5 implies $|\langle u^{(\alpha)}, v\rangle|^2 \le \varphi^{C_\zeta}N^{-1}$, and the claim follows. This concludes
the proof of Theorem 6.3. □



6.3. The permissible region. The rest of this section is devoted to the proof of Theorem 2.7. 

Definition 6.4. We choose an event, denoted by $\Xi$, of $\zeta$-high probability on which the following statements
hold.

(i) The eigenvalues of $H$ are distinct.

(ii) For all $i = 1, \dots, k$ and $\alpha = 1, \dots, N$ we have $\langle v^{(i)}, u^{(\alpha)}\rangle \neq 0$.

(iii) All statements of Theorems 2.2, 2.3, 2.5, and 3.7 hold.

We note that such a $\Xi$ exists. As explained in Section 6.1, we assume without loss of generality that the
law of $H$ is absolutely continuous. Then conditions (i) and (ii) hold almost surely; we omit the standard
proof. That condition (iii) holds with $\zeta$-high probability is a consequence of Theorems 2.2, 2.3, 2.5, and 3.7
(see also Remark 2.4).

For the whole remainder of the proof of Theorem 2.7, we choose and fix an arbitrary realization $H \equiv H^\omega$
with $\omega \in \Xi$. Thus, the randomness of $H$ only comes into play in ensuring that $\Xi$ is of $\zeta$-high probability.
The rest of the argument is entirely deterministic.

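Fix $k^-, k^+ \in \mathbb{N}$ and define $k^0 := k - k^+ - k^- = \#\{i : |d_i| \le 1\}$. Write

$$d = (d_1, \dots, d_k) = (d^-, d^0, d^+), \qquad d^\sigma = (d^\sigma_1, \dots, d^\sigma_{k^\sigma}) \quad (\sigma = -, 0, +).$$
We adopt the convention that

$$d^-_1 \le \cdots \le d^-_{k^-} < -1 \le d^0_1 \le \cdots \le d^0_{k^0} \le 1 < d^+_1 \le \cdots \le d^+_{k^+}.$$ (6.9)

Abbreviate

$$\psi \equiv \psi_N := 2k\varphi.$$ (6.10)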
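For $C_2 > 0$ define the sets

$$\mathcal{D}^-(C_2) := \big\{d^- : -\Sigma + 1 \le d^-_i \le -1 - \varphi^{C_2}N^{-1/3},\ i = 1, \dots, k^-\big\},$$
$$\mathcal{D}^+(C_2) := \big\{d^+ : 1 + \varphi^{C_2}N^{-1/3} \le d^+_i \le \Sigma - 1,\ i = 1, \dots, k^+\big\},$$
$$\mathcal{D}^0(C_2) := \big\{d^0 : -1 + \varphi^{C_2}N^{-1/3} \le d^0_i \le 1 - \varphi^{C_2}N^{-1/3},\ i = 1, \dots, k^0\big\},$$

the set of allowed $d$'s,

$$\mathcal{D}(C_2) := \big\{(d^-, d^0, d^+) : d^\sigma \in \mathcal{D}^\sigma(C_2),\ \sigma = -, 0, +\big\},$$

and the subset

$$\mathcal{D}^*(C_2) := \big\{d \in \mathcal{D}(C_2) : d_i \neq 0 \text{ for } i = 1, \dots, k\big\}.$$
Let $K > 0$ denote a constant to be chosen later, and define

$$S(K) := (-\infty, -2 + \varphi^K N^{-2/3}) \cup (2 - \varphi^K N^{-2/3}, \infty).$$

We shall only consider eigenvalues of $\widetilde{H}$ in $S(K)$ for some large but fixed $K$.

Let $C_3 > 0$ denote some large constant to be chosen later. Define the intervals

$$I^-_i(d) := \big[\theta(d^-_i) - \varphi^{C_3}N^{-1/2}(-d^-_i - 1)^{1/2},\ \theta(d^-_i) + \varphi^{C_3}N^{-1/2}(-d^-_i - 1)^{1/2}\big] \qquad (i = 1, \dots, k^-),$$
$$I^+_i(d) := \big[\theta(d^+_i) - \varphi^{C_3}N^{-1/2}(d^+_i - 1)^{1/2},\ \theta(d^+_i) + \varphi^{C_3}N^{-1/2}(d^+_i - 1)^{1/2}\big] \qquad (i = 1, \dots, k^+),$$
$$I^0 := \big\{x \in \mathbb{R} : \mathrm{dist}(x, \sigma(H)) \le N^{-2/3}\psi^{-1}\big\} \cap S(K).$$
For $d \in \mathcal{D}(C_2)$ define

$$\Gamma(d) := I^0 \cup \Big(\bigcup_{i=1}^{k^-} I^-_i(d)\Big) \cup \Big(\bigcup_{i=1}^{k^+} I^+_i(d)\Big).$$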

The following proposition states that $\Gamma(d)$ is the "permissible region" for the eigenvalues of $\widetilde{H}$. Roughly,
the allowed region consists of a small neighbourhood of each $\theta(d_i)$ for $i \in O$, as well as of small neighbourhoods
of the eigenvalues of $H$. The latter regions house the sticking eigenvalues. Proposition 6.5 only establishes
where the eigenvalues are allowed to lie; it gives no other information on their locations (such as the number
of eigenvalues in each interval). Note that, by definition of $S(K)$, the set $\Gamma(d)$ only keeps track of eigenvalues
outside of the interval $[-2 + \varphi^K N^{-2/3}, 2 - \varphi^K N^{-2/3}]$. This will eventually suffice for the statement (2.21)
thanks to the eigenvalue rigidity estimate for $H$, Theorem 3.7, combined with eigenvalue interlacing; see
(6.34) below.

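Proposition 6.5. For $C_3$ and $C_2(C_3)$ large enough (depending on $\zeta$, $K$, and the constant $C_1$ from Theorem
2.3) the following holds. For any $d \in \mathcal{D}(C_2)$ and $H \equiv H^\omega$ with $\omega \in \Xi$ we have

$$I^\pm_i(d) \cap I^0 = \emptyset \qquad \text{for all } i = 1, \dots, k^\pm$$ (6.11)

as well as

$$\sigma(\widetilde{H}) \cap S(K) \subset \Gamma(d).$$ (6.12)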
Proof. Clearly, it is enough to prove the claim for $d \in \mathcal{D}^*(C_2)$. We shall choose the constants $C_3(\zeta, C_1)$
and $C_2(\zeta, K, C_1, C_3)$ to be large enough during the proof. (Here $C_1$ is the constant from Theorem 2.3.)
First we prove (6.11). By definition of $\Xi$ (see Theorem 3.7), we find that (6.11) holds if

$$2 + \varphi^{2C_2}N^{-2/3} - \varphi^{C_3 + C_2/2}N^{-2/3} \ge 2 + 2\varphi^{C_\zeta}N^{-2/3} \ge \lambda_N + N^{-2/3}\psi^{-1},$$
which is satisfied provided that

$$2C_2 > C_3 + C_2/2 + C_\zeta.$$ (6.13)
In order to prove (6.12), we define, for each $z \in \mathbb{C} \setminus \sigma(H)$, the $k \times k$ matrix $M(z)$ through

$$M_{ij}(z) := G_{v^{(i)}v^{(j)}}(z) + \delta_{ij}d_i^{-1}.$$ (6.14)

From Lemma 6.1 we find that $x \in \sigma(\widetilde{H}) \setminus \sigma(H)$ if and only if $M(x)$ is singular. The proof therefore consists
in locating $x \in \mathbb{R} \setminus \sigma(H)$ for which $M(x)$ is singular.

First we consider the case $x \ge 2 + \varphi^{C_2}N^{-2/3}$. On $\Xi$ we have

X N sC 2 + ip^N- 2 ' 3 and Ai ^ -2 - /^iV^ 2 / 3 (6.15) 

provided $C_2$ is large enough (see Theorem 3.7). In particular, by (6.15) and the definition of $\Xi$, we have
$x \notin \sigma(H)$. By increasing $C_2$ if necessary we may assume that $C_2 \ge C_1$, where $C_1$ is the constant from
Theorem 2.3. Therefore we get from Theorem 2.3 and Lemma 3.2 that

$$M(x + iy) = m(x + iy) + D^{-1} + O\big(\varphi^{C_\zeta}N^{-1/2}\kappa^{-1/4}\big)$$ (6.16)

for all $y \in [-\Sigma, \Sigma]$. (We include an imaginary part $y \neq 0$ for later applications of (6.16); for the purposes of
this proof we set $y = 0$.)

Let $i \in \{1, \dots, k^+\}$. Then we may repeat to the letter the argument in the proof of Theorem 6.3 leading
to (6.4). Provided that $C_3 \ge C_\zeta + 2$, where $C_\zeta$ is the constant in (6.16), we therefore get that



m( X ) + * 



^ tp^N- 1 / 2 *- 1 / 4 if xil+(d). 



This takes care of the components $d^+$ in $D^{-1}$. In order to deal with the remaining components, $d^0$ and $d^-$,
we observe that

$$m(x) \in [-1, -c]$$

for some $c > 0$ depending on $\Sigma$. It is now easy to put all the estimates associated with $i = 1, \dots, k$ together.
Recalling (6.16) and choosing $C_2$ large enough yields, for $C_\zeta$ denoting the constant from (6.16),



/ N 1 

m(x) + — 



for all $i = 1, \dots, k$ provided that

$$x \in [2 + \varphi^{C_2}N^{-2/3}, \Sigma] \setminus \bigcup_{i=1}^{k^+} I^+_i(d).$$ (6.17)

We conclude${}^4$ from (6.16) that $M(x)$ is regular if (6.17) holds.

An almost identical argument applied to $d^-$ yields that $M(x)$ is regular if

$$x \in \Big([-\Sigma, -2 - \varphi^{C_2}N^{-2/3}] \cup [2 + \varphi^{C_2}N^{-2/3}, \Sigma]\Big) \setminus \Big(\bigcup_{i=1}^{k^-} I^-_i(d) \cup \bigcup_{i=1}^{k^+} I^+_i(d)\Big).$$ (6.18)

Next, we focus on the case

$$x \in [2 - \varphi^K N^{-2/3}, 2 + \varphi^{C_2}N^{-2/3}], \qquad \mathrm{dist}(x, \sigma(H)) > N^{-2/3}\psi^{-1}.$$ (6.19)

Our aim is to prove that $M(x)$ is regular for any $x$ satisfying (6.19). Once this is done, the regularity of
$M(x)$ for $x$ satisfying (6.18) or (6.19) will imply (6.12). Choose $\eta := N^{-2/3}\psi^{-1}$ and estimate



\G v a )v u)(x) - G v a) v u)(x + < ^2 



|(u(«), v W)| 2 + |(u( a ),v«)| 2 



1 



X a — x X a — x — IT/ 



E(l(u'"',v«)l^l(u^,v0))| 2 )__ x)2+i)2 

= Im G v (, )v( i) (x + it]) + Im G vU ) v u) (x + if)) , 

where in the second step we used (6.19). Therefore, by definition of $\Xi$ (see also Theorem 2.2) and Lemma
3.2, we get (recall that $\psi \ge 1$)



G v (O v o»(aO - 5 ij m{x + iri) + o(^ c <lmm{x + ir ] )+'^j = +0 (V^JV" 1 / 3 ^ + + ^c 2 , • 



))■ 



This implies, for any x satisfying (6.19), that 



$$M(x) = -1 + D^{-1} + O\Big(\varphi^{C_\zeta}N^{-1/3}\big(\psi + \varphi^{K/2} + \varphi^{C_2/2}\big)\Big).$$



(6.20) 



Since 



-1 + 



for all $i$, we find that $M(x)$ is regular provided $C_2$ is chosen large enough that

$$C_2 - 1 > C_\zeta + K/2 + C_2/2.$$
This completes the analysis of the case (6.19). The case

$$x \in [-2 - \varphi^{C_2}N^{-2/3}, -2 + \varphi^K N^{-2/3}], \qquad \mathrm{dist}(x, \sigma(H)) > N^{-2/3}\psi^{-1}$$



is handled similarly. This completes the proof. 



□ 



4 Here we use the well-known fact that if $\lambda \in \sigma(A + B)$ then $\mathrm{dist}(\lambda, \sigma(A)) \le \|B\|$.
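
The perturbation fact recalled in footnote 4 can be illustrated as follows. The sketch assumes that $A$ is Hermitian (so that the bound holds with the operator norm of $B$ and no condition-number factor) and takes $B$ to be an arbitrary, not necessarily Hermitian, matrix; sizes and scales are arbitrary test values.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 6
X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (X + X.conj().T) / 2                                             # Hermitian, hence normal
B = 0.3 * (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))   # arbitrary perturbation

spec_A = np.linalg.eigvalsh(A)          # real spectrum of A
spec_AB = np.linalg.eigvals(A + B)      # possibly complex spectrum of A + B
norm_B = np.linalg.norm(B, 2)           # operator (spectral) norm of B

# distance from each eigenvalue of A + B to the spectrum of A
dists = np.array([np.min(np.abs(lam - spec_A)) for lam in spec_AB])
ok = bool(np.all(dists <= norm_B + 1e-10))
print(dists.max(), norm_B, ok)
```

In the proof this is applied with $A = m(x) + D^{-1}$, which is diagonal, and $B$ the error term in (6.16), so that a lower bound on the diagonal entries of $A$ exceeding the size of the error forces $M(x)$ to be regular.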
6.4. The initial configuration. In this section we fix a configuration $d(0) \equiv d$ that is independent of $N$, and
satisfies $k^0 = 0$ as well as

$$-\Sigma + 1 \le d^-_1 < \cdots < d^-_{k^-} < -1, \qquad 1 < d^+_1 < \cdots < d^+_{k^+} \le \Sigma - 1.$$ (6.21)

Note that $d \in \mathcal{D}^*(C_2)$ for large enough $N$.
First we deal with the outliers. 

Proposition 6.6. For $N$ large enough, each interval $I^-_i(d)$, $i = 1, \dots, k^-$, and $I^+_i(d)$, $i = 1, \dots, k^+$,
contains precisely one eigenvalue of $\widetilde{H}$.

Proof. Let $i \in \{1, \dots, k^+\}$ and pick a small $N$-independent positively oriented closed contour $\mathcal{C} \subset \mathbb{C} \setminus [-2, 2]$
that encloses $\theta(d^+_i)$ but no other point of the set $\bigcup_{\sigma=\pm}\bigcup_{i=1}^{k^\sigma}\{\theta(d^\sigma_i)\}$. By Proposition 6.5, it suffices to show
that the interior of $\mathcal{C}$ contains precisely one eigenvalue of $\widetilde{H}$. Define

$$f_N(z) := \det(M(z) + D^{-1}), \qquad g(z) := \det(m(z) + D^{-1}).$$

The functions $g$ and $f_N$ are holomorphic on and inside $\mathcal{C}$ (for large enough $N$). Moreover, by construction
of $\mathcal{C}$, the function $g$ has precisely one zero inside $\mathcal{C}$, namely at $z = \theta(d^+_i)$. Next, we have

$$\min_{z \in \mathcal{C}} |g(z)| \ge c > 0, \qquad |g(z) - f_N(z)| \le \varphi^{C_\zeta}N^{-1/2},$$

where the second inequality follows from (6.16). The claim now follows from Rouché's theorem. The
eigenvalues near $\theta(d^-_i)$, $i = 1, \dots, k^-$, are handled similarly. □

Before moving on, we record the following result on rank-one deformations. 

Lemma 6.7. Let $v \in \mathbb{C}^k$ be nonzero. Then for all $i = 1, \dots, k - 1$ and all Hermitian $k \times k$ matrices $A$ we
have

$$\lim_{d \to \infty} \lambda_i(A + dvv^*) = \lim_{d \to -\infty} \lambda_{i+1}(A + dvv^*).$$

Proof. By Lemma 6.1, we find that $x \notin \sigma(A)$ is an eigenvalue of $A + dvv^*$ if and only if

$$\langle v, (A - x)^{-1}v\rangle = -\frac{1}{d}.$$

Let

$$E := \big\{A : \text{the eigenvalues of } A \text{ are distinct},\ \langle v, u^{(i)}(A)\rangle \neq 0 \text{ for all } i\big\},$$

where $u^{(i)}(A)$ denotes the eigenvector of $A$ associated with the eigenvalue $\lambda_i(A)$. (Note that $u^{(i)}(A)$ is
well-defined in $E$, since the eigenvalues are distinct.) It is not hard to see that $E$ is dense in the space of
Hermitian matrices.

We write the condition $\langle v, (A - x)^{-1}v\rangle = -d^{-1}$ as

$$f(x) := \sum_i \frac{|\langle v, u^{(i)}(A)\rangle|^2}{\lambda_i(A) - x} = -\frac{1}{d}.$$

Let $A \in E$. Then $f$ has $k$ singularities at the eigenvalues of $A$, away from which we have $f' > 0$. Moreover,
$f(x) \uparrow 0$ as $x \uparrow \infty$, and $f(x) \downarrow 0$ as $x \downarrow -\infty$. Thus, for any $d \in \mathbb{R} \setminus \{0\}$, the equation $f(x) = -d^{-1}$
has exactly $k$ solutions in $\mathbb{R} \setminus \sigma(A)$. Since $A + dvv^*$ has at most $k$ distinct eigenvalues, this proves that
$\sigma(A + dvv^*) \cap \sigma(A) = \emptyset$ for all $d \in \mathbb{R} \setminus \{0\}$. Moreover, the equation $f(x) = 0$ has exactly $k - 1$ solutions,
$x_1, \dots, x_{k-1}$. Since $f'(x_i) > 0$ for each $i = 1, \dots, k - 1$, it is easy to see that $x_i = \lim_{d \to \infty} \lambda_i(A + dvv^*) =
\lim_{d \to -\infty} \lambda_{i+1}(A + dvv^*)$.

Now the claim follows by approximating an arbitrary matrix $A$ by matrices in $E$, and by using the
Lipschitz continuity of the map $A \mapsto \lambda_i(A)$. □
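
Lemma 6.7 can be observed numerically by taking $|d|$ very large in both directions. The sketch below uses a random real symmetric $5 \times 5$ matrix and $d = \pm 10^8$ as a proxy for the limits; these choices, and the tolerance implied by them, are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
k = 5
X = rng.normal(size=(k, k))
A = (X + X.T) / 2                      # random real symmetric k x k matrix
v = rng.normal(size=k)                 # generic nonzero vector
P = np.outer(v, v)                     # the rank-one matrix v v^*

d = 1e8                                # large coupling, standing in for d -> infinity
up = np.linalg.eigvalsh(A + d * P)     # ascending; the top eigenvalue escapes to +infinity
down = np.linalg.eigvalsh(A - d * P)   # ascending; the bottom eigenvalue escapes to -infinity

# lambda_i(A + d v v^*) for i = 1..k-1 versus lambda_{i+1}(A - d v v^*)
gap = np.max(np.abs(up[:-1] - down[1:]))
print(up[:-1])
print(down[1:])
print(gap)
```

The $k - 1$ eigenvalues that stay bounded are the zeros $x_1, \dots, x_{k-1}$ of the function $f$ from the proof, which is why both limits agree.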

We now deal with the extremal bulk eigenvalues. 

Proposition 6.8. Fix $0 < \delta < 1/3$ and $K > 0$. Let $d$ be $N$-independent and satisfy (6.21). Then for large
enough $N$ (depending on $\delta$ and $K$) we have for all $\alpha$ satisfying $\lambda_\alpha \ge 2 - \varphi^K N^{-2/3}$ that

$$|\lambda_\alpha - \mu_{\alpha - k^+}| \le N^{-1+\delta}.$$

Similarly, we have for all $\alpha$ satisfying $\lambda_\alpha \le -2 + \varphi^K N^{-2/3}$ that

$$|\lambda_\alpha - \mu_{\alpha + k^-}| \le N^{-1+\delta}.$$

Proof. We only prove the first statement; the proof of the second one is almost identical. Abbreviate
$\delta' := \delta/2$.

Before embarking on the full proof, we first give a sketch of its main idea, under some simplifying
assumptions. Let $A \in \mathbb{N}$ be some fixed constant, and assume that, for each $\alpha > N - A$, the neighbours of
$\lambda_\alpha$ are further than $N^{-1+\delta}$ away from $\lambda_\alpha$. (This assumption in fact holds with probability $1 - o(1)$, a fact
we shall neither use nor prove.) We claim that there is at least one eigenvalue of $\widetilde{H}$ in the interval $[x^\alpha_-, x^\alpha_+]$
surrounding $\lambda_\alpha$, where

$$x^\alpha_\pm := \lambda_\alpha \pm N^{-1+\delta'}/3.$$

Before sketching the proof of the above claim, we show how to use it to conclude the argument. By
Proposition 6.6, there are at least $k^+$ eigenvalues in $(x^N_+, \infty)$. Recall that by assumption $k^0 = 0$, i.e. $|d_i| > 1$
for all $i$. Therefore using interlacing, i.e. a repeated application of Lemma 6.2, we conclude that there are
exactly $k^+$ eigenvalues in $(x^N_+, \infty)$. From the above claim we find that there is at least one eigenvalue in
$[x^N_-, x^N_+]$. Using interlacing we find that there are at most $k^+ + 1$ eigenvalues in $[x^N_-, \infty)$. We conclude that
there is exactly one eigenvalue in $[x^N_-, x^N_+]$. We may move on to the $(N - 1)$-th eigenvalue: we have proved
that there are (i) at least $k^+ + 1$ eigenvalues in $[x^N_-, \infty)$ (from the previous step), (ii) at least one eigenvalue
in $[x^{N-1}_-, x^{N-1}_+]$ (from the claim), and (iii) at most $k^+ + 2$ eigenvalues in $[x^{N-1}_-, \infty)$ (from interlacing); we
conclude that there is exactly one eigenvalue in $[x^{N-1}_-, x^{N-1}_+]$. Continuing in this fashion concludes the proof.

Let us now complete the sketch of the proof of the above claim. Assume for simplicity that $H$ and $\widetilde H$ have no common eigenvalues. From Lemma 6.1 we find that $x$ is an eigenvalue of $\widetilde H$ if and only if the matrix $M(x)$, defined in (6.14), is singular. Thus, we have to prove that there is an $x \in [x_-^\alpha, x_+^\alpha]$ such that $M(x)$ is singular. The idea of the argument is to do a spectral decomposition of $G$, and resum all terms not associated with $\lambda_\alpha$ to get something close to $\operatorname{Re} m(x) \approx -1$. More precisely, we write

$$M_{ij}(x) = \frac{\langle v^{(i)}, u^{(\alpha)}\rangle \langle u^{(\alpha)}, v^{(j)}\rangle}{\lambda_\alpha - x} + \sum_{\beta \ne \alpha} \frac{\langle v^{(i)}, u^{(\beta)}\rangle \langle u^{(\beta)}, v^{(j)}\rangle}{\lambda_\beta - x} + \delta_{ij} d_i^{-1} \approx \frac{\langle v^{(i)}, u^{(\alpha)}\rangle \langle u^{(\alpha)}, v^{(j)}\rangle}{\lambda_\alpha - x} + \operatorname{Re} m(x)\, \delta_{ij} + \delta_{ij} d_i^{-1},$$

where the sum over $\beta$ was replaced with $\operatorname{Re} m(x)\, \delta_{ij}$ (up to negligible error terms). This approximation will be justified using Theorems 2.2 and 2.5; it uses that $x \in [x_-^\alpha, x_+^\alpha]$ and consequently all eigenvalues $\lambda_\beta$, $\beta \ne \alpha$, are separated from $x$ by at least $N^{-1+\delta}/3$. Introducing the vector $y = (y_i) \in \mathbb{C}^k$, defined by $y_i := \langle v^{(i)}, u^{(\alpha)}\rangle$, we therefore get

$$M(x) \approx \frac{y y^*}{\lambda_\alpha - x} - 1 + D^{-1}, \tag{6.22}$$

where we used that $\operatorname{Re} m(x) \approx -1$. By assumption, $|d_i| > 1$ for all $i$; therefore the matrix $-1 + D^{-1}$ is strictly negative. Also, Theorem 2.5 implies that $|y| \le \varphi^{C_\zeta} N^{-1/2}$. Thus it is easy to conclude that all eigenvalues of $M(x_-^\alpha)$ are negative. The first term on the right-hand side of (6.22) is a rank-one matrix. As $x$ approaches $\lambda_\alpha$ from the left, its nonzero eigenvalue tends to $+\infty$. By continuity, there must therefore exist an $x \in [x_-^\alpha, \lambda_\alpha)$ such that $M(x)$ is singular. This concludes the sketch of the proof of the claim.
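The singularity criterion used in this sketch (an $x$ outside the spectrum of $H$ is an eigenvalue of the deformed matrix if and only if $M(x)$ is singular) can be illustrated numerically. The following is a sketch under stated assumptions, not the paper's construction: it takes a real symmetric Gaussian matrix with the normalisation $\mathbb{E} h_{ij}^2 = 1/N$, an illustrative rank-two deformation, and uses $M(x) = V^* G(x) V + D^{-1}$ with $G(x) = (H - x)^{-1}$. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, k = 200, 2

# GOE-like matrix: off-diagonal variance 1/N, spectrum close to [-2, 2].
g = rng.standard_normal((N, N))
H = (g + g.T) / np.sqrt(2 * N)

# Orthonormal deformation directions and strengths with |d_i| > 1 (illustrative).
V, _ = np.linalg.qr(rng.standard_normal((N, k)))
D = np.diag([3.0, -2.5])
H_def = H + V @ D @ V.T

def M(x):
    """k x k matrix V^* G(x) V + D^{-1}, with G(x) = (H - x)^{-1}."""
    G = np.linalg.inv(H - x * np.eye(N))
    return V.T @ G @ V + np.linalg.inv(D)

mu = np.linalg.eigvalsh(H_def)
top = mu[-1]                                   # the positive outlier
smallest_sv = np.linalg.svd(M(top), compute_uv=False)[-1]

print(top)          # close to d + 1/d for d = 3
print(smallest_sv)  # numerically zero: M is singular at the outlier
```

The identity behind this check is $\det(H + VDV^* - x) = \det(H - x)\,\det(D)\,\det(D^{-1} + V^* G(x) V)$, valid for $x \notin \sigma(H)$.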

Now we turn towards the detailed proof in the general case. Since eigenvalues of $H$ may be separated by less than $N^{-1+\delta'}$, we begin by clumping together eigenvalues of $H$ which are separated by less than $N^{-1+\delta'}$. More precisely, we construct a partition $\mathcal{A} = (A_q)_q$ of $\{1, \dots, N\}$, defined as the finest partition in which $\alpha$ and $\beta$ belong to the same block if $|\lambda_\alpha - \lambda_\beta| \le N^{-1+\delta'}$. Thus, each block consists of a sequence of consecutive integers. We order the blocks of $\mathcal{A}$ in a "decreasing" fashion, in such a way that if $q < r$ then $\lambda_\alpha > \lambda_\beta$ for all $\alpha \in A_q$ and $\beta \in A_r$.
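The clumping step above is a simple single-linkage grouping of sorted eigenvalues. A minimal sketch follows, in which the function name `clump` and the sample numbers are illustrative assumptions: two neighbouring eigenvalues share a block whenever their gap is at most the threshold, and blocks are listed from the top of the spectrum downwards.

```python
import numpy as np

def clump(eigs, gap):
    """Finest partition of the indices into blocks such that two eigenvalues
    at distance at most `gap` lie in the same block.

    Blocks are returned in "decreasing" order (largest eigenvalues first).
    Hypothetical helper, for illustration only.
    """
    order = np.argsort(eigs)[::-1]            # indices by decreasing eigenvalue
    blocks = [[int(order[0])]]
    for prev, cur in zip(order[:-1], order[1:]):
        if eigs[prev] - eigs[cur] <= gap:
            blocks[-1].append(int(cur))       # close to its neighbour: same block
        else:
            blocks.append([int(cur)])         # large gap: start a new block
    return blocks

eigs = np.array([0.0, 0.05, 0.5, 0.52, 0.56, 1.0])
blocks = clump(eigs, gap=0.06)
print(blocks)
```

Because the eigenvalues are processed in sorted order, chaining neighbours already yields the transitive closure of the relation, so each block is a run of consecutive indices.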

We now derive a bound on the size of the blocks near the edge. Roughly, we shall show that if $\alpha \in A_q$ and $\lambda_\alpha \ge 2 - \varphi^{C} N^{-2/3}$ then $|A_q| \le \varphi^{C}$. Let $C_4$ be a large constant to be chosen later. Now choose $\alpha$ and $\beta$ satisfying $0 \le \alpha < \beta \le \varphi^{C_4}$ such that $N - \alpha$ and $N - \beta$ belong to the same block. Then by definition of $\Xi$ and $\mathcal{A}$ we have

$$c\bigl[(\beta/N)^{2/3} - (\alpha/N)^{2/3}\bigr] - \varphi^{C_\zeta} N^{-2/3} \le \lambda_{N-\alpha} - \lambda_{N-\beta} \le (\beta - \alpha) N^{-1+\delta'},$$

where we used the statement of Theorem 3.7 and the definition (3.17). Thus we get the condition

$$N^{-2/3}\bigl[c\,\beta^{-1/3}(\beta - \alpha) - \varphi^{C_\zeta}\bigr] \le N^{-1+\delta'}(\beta - \alpha).$$

We conclude that if $\alpha$ and $\beta$ satisfy $0 \le \alpha < \beta \le \varphi^{C_4}$ and $N - \alpha$ and $N - \beta$ belong to the same block, then

$$\beta - \alpha \le \varphi^{C_\zeta + C_4/3 + 1}. \tag{6.23}$$

Let $\alpha_*$ denote the largest integer such that $\lambda_{N - \alpha_*} \ge 2 - \varphi^K N^{-2/3}$. In particular, by definition of $\Xi$ (see Theorem 3.7) we have

$$\alpha_* \le \varphi^{3K/2 + C_\zeta}. \tag{6.24}$$

Now we choose $C_4 \equiv C_4(\zeta, K)$ large enough that

$$C_4 \ge \max\bigl(3K/2 + C_\zeta,\; C_\zeta + C_4/3 + 1\bigr) + 2.$$

Next, define $Q$ through $N - \alpha_* \in A_Q$. Therefore we get from (6.23) and (6.24) that any $\alpha \le \varphi^{C_4}$ such that $N - \alpha \in A_Q$ satisfies

$$\alpha \le \alpha_* + \varphi^{C_\zeta + C_4/3 + 1} \le \varphi^{C_4 - 1}.$$

Since blocks are contiguous, we conclude that

$$|A_q| \le \varphi^{C_4 - 1} \tag{6.25}$$

for each $q = 1, \dots, Q$. Moreover, by definition of $\Xi$ (see Theorem 3.7), we find

$$|\lambda_{N - \alpha} - 2| \le \varphi^{2C_4/3 + C_\zeta} N^{-2/3}$$

for all $q = 1, \dots, Q$ and all $\alpha$ such that $N - \alpha \in A_q$.

Now we are ready for the main argument. Pick $q \in \{1, \dots, Q\}$ and abbreviate

$$a^q := \min_{\alpha \in A_q} \lambda_\alpha, \qquad b^q := \max_{\alpha \in A_q} \lambda_\alpha.$$

We introduce the path

$$x_t^q := a^q - N^{-1+\delta'}/3 + \bigl(b^q - a^q + 2N^{-1+\delta'}/3\bigr)\, t, \qquad (t \in [0, 1]),$$

which will serve to count eigenvalues. (Note that $x_0^q = a^q - N^{-1+\delta'}/3$ and $x_1^q = b^q + N^{-1+\delta'}/3$.) The interval $[x_0^q, x_1^q]$ contains precisely those eigenvalues of $H$ that are in $A_q$, and its endpoints $x_0^q$ and $x_1^q$ are at a distance greater than $N^{-1+\delta'}/3$ from any eigenvalue of $H$. Thus, $[x_0^q, x_1^q]$ is the correct generalization of the interval $[x_-^\alpha, x_+^\alpha]$ from the sketch given at the beginning of this proof.

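In order to avoid problems with exceptional events, we add some randomness to $D$. Recall that $D$ satisfies (6.21). Let $\Delta$ be a $k \times k$ Hermitian random matrix whose upper triangular entries are independent and have an absolutely continuous law supported in the unit disk. For $\varepsilon > 0$ define

$$\widetilde H^\varepsilon := H + V (D^{-1} + \varepsilon \Delta)^{-1} V^*.$$

From now on we use "almost surely" to mean almost surely with respect to the randomness of $\Delta$. Our main goal is to prove that for each $\varepsilon > 0$, almost surely, there are at least $|A_q|$ eigenvalues of $\widetilde H^\varepsilon$ in $[x_0^q, x_1^q] \setminus \sigma(H)$. (Having done this, we shall deduce, by taking $\varepsilon \to 0$, that $\widetilde H$ has at least $|A_q|$ eigenvalues in $[x_0^q, x_1^q]$.) For $x \notin \sigma(H)$ define

$$M^\varepsilon_{ij}(x) := G_{v^{(i)} v^{(j)}}(x) + \delta_{ij} d_i^{-1} + \varepsilon \Delta_{ij} \qquad (i, j = 1, \dots, k).$$

Then (assuming $x \notin \sigma(H)$) we know that $x \in \sigma(\widetilde H^\varepsilon)$ if and only if $M^\varepsilon(x)$ is singular. Split

$$G_{v^{(i)} v^{(j)}}(x) = \sum_{\alpha \in A_q} \frac{\langle v^{(i)}, u^{(\alpha)}\rangle \langle u^{(\alpha)}, v^{(j)}\rangle}{\lambda_\alpha - x} + \sum_{\alpha \notin A_q} \frac{\langle v^{(i)}, u^{(\alpha)}\rangle \langle u^{(\alpha)}, v^{(j)}\rangle}{\lambda_\alpha - x}.$$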

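Let $x \in [x_0^q, x_1^q]$. Similarly to the proof of (6.20), we choose $\eta := N^{-1+\delta'}$ and estimate

$$\Biggl| \sum_{\alpha \notin A_q} \frac{\langle v^{(i)}, u^{(\alpha)}\rangle \langle u^{(\alpha)}, v^{(j)}\rangle}{\lambda_\alpha - x} - \sum_{\alpha \notin A_q} \frac{\langle v^{(i)}, u^{(\alpha)}\rangle \langle u^{(\alpha)}, v^{(j)}\rangle}{\lambda_\alpha - x - i\eta} \Biggr| \le 2 \Bigl( \operatorname{Im} G_{v^{(i)} v^{(i)}}(x + i\eta) + \operatorname{Im} G_{v^{(j)} v^{(j)}}(x + i\eta) \Bigr),$$

where we used that $|x - \lambda_\alpha| \ge 2N^{-1+\delta'}/3$ for $\alpha \notin A_q$. Moreover,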

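$$\Biggl| \sum_{\alpha \in A_q} \frac{\langle v^{(i)}, u^{(\alpha)}\rangle \langle u^{(\alpha)}, v^{(j)}\rangle}{\lambda_\alpha - x - i\eta} \Biggr| \le \varphi^{C_\zeta + C_4} N^{-\delta'},$$

where we used (6.23) and the definition of $\Xi$ (see Theorem 2.5). Estimating $G_{v^{(i)} v^{(j)}}(x + i\eta) - m(x + i\eta)$ therefore yields, similarly to (6.20),

$$M^\varepsilon_{ij}(x) = \sum_{\alpha \in A_q} \frac{\langle v^{(i)}, u^{(\alpha)}\rangle \langle u^{(\alpha)}, v^{(j)}\rangle}{\lambda_\alpha - x} - \delta_{ij} + \delta_{ij} d_i^{-1} + \varepsilon \Delta_{ij} + O\bigl(\varphi^{C_\zeta + C_4} N^{-\delta'/2}\bigr).$$

Introducing the vector $y^{(\alpha)} = (y_i^{(\alpha)})_{i=1}^k \in \mathbb{C}^k$ defined by

$$y_i^{(\alpha)} := \langle v^{(i)}, u^{(\alpha)}\rangle,$$

we get

$$M^\varepsilon(x) = \sum_{\alpha \in A_q} \frac{y^{(\alpha)} (y^{(\alpha)})^*}{\lambda_\alpha - x} - 1 + D^{-1} + \varepsilon \Delta + R(x), \qquad R(x) = O\bigl(\varphi^{C_\zeta + C_4} N^{-\delta'/2}\bigr), \tag{6.26}$$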



where $R(x)$ is continuous in $x$ and independent of $\Delta$. Compare this to (6.22) in the sketch given at the beginning of the proof. By Theorem 2.5, for $\alpha \in A_q$ we have

$$|y^{(\alpha)}| = O\bigl(\varphi^{C_\zeta} N^{-1/2}\bigr). \tag{6.27}$$

We may now start the counting of the eigenvalues of $\widetilde H^\varepsilon$ in $[x_0^q, x_1^q]$. We have to prove that there are at least $L := |A_q|$ distinct points $x$ in $[x_0^q, x_1^q]$ at which $M^\varepsilon(x)$ has a zero eigenvalue. As in the simple continuity argument given in the sketch at the beginning of this proof, we shall make use of continuity. However, having to find $L$ such values $x$ instead of just one is a significant complication (see footnote 5). Before coming to the full counting argument, we give a sketch of its main idea. See Figure 6.1 for a graphical depiction of this sketch. We extend the real line $\mathbb{R}$, on which the eigenvalues of $M^\varepsilon(x)$ reside, to the real projective line $\overline{\mathbb{R}} = \mathbb{R} \cup \{\infty\} \cong S^1$. One can think of $\overline{\mathbb{R}}$ as a ring with two distinguished points, $0$ at the bottom and $\infty$ at the top. Thanks to Lemma 6.7, it is possible to label the $k$ eigenvalues of $M^\varepsilon(x_t^q)$ so that they are continuous $\overline{\mathbb{R}}$-valued functions (denoted by $\tilde e_1^\varepsilon(t), \dots, \tilde e_k^\varepsilon(t)$ below) on $[0, 1]$. Thus, we get a family of $k$ beads moving continuously counterclockwise on a ring. At $t = 0$, the eigenvalues are all strictly negative (and finite), i.e. all beads lie in the left half of the ring. As $t$ is continuously increased from $0$ to $1$, the beads move counterclockwise around the ring. Our goal is to count the number of times $0$ is hit by a bead. Thanks to the explicit form of the first term on the right-hand side of (6.26), we know that the point $\infty$ is hit exactly $L$ times as $t$ ranges from $0$ to $1$. Since at time $t = 0$ all beads were in the left half of the ring, and since the beads move continuously counterclockwise, we conclude by continuity that $0$ is hit at least $L$ times as $t$ ranges from $0$ to $1$. Below, we denote the times at which $\infty$ is hit by $s_1, \dots, s_L$, and the times at which $0$ is hit by $t_1, \dots, t_L$. One nuisance we have to deal with in the proof is the possibility of several beads crossing one of the two points $0$ or $\infty$ simultaneously. Such events are not admissible for our counting. For instance, if at time $t$ a bead is at $0$ while another is at $\infty$, we cannot conclude that $x_t^q$ is an eigenvalue of $\widetilde H^\varepsilon$; indeed, because there is a bead at $\infty$, we know that $x_t^q$ is an eigenvalue of $H$, and hence Lemma 6.1 is not applicable. However, such pathological events almost surely do not occur. Avoiding them was the reason for introducing $\Delta$. Note that the final result of the counting argument, the number of eigenvalues of $\widetilde H^\varepsilon$ in $[x_0^q, x_1^q]$, is stable under the limit $\varepsilon \to 0$. This will allow us to conclude the proof.

5 This complication is also visible in the joint arrangement of the eigenvalues of $H$ and $\widetilde H$. If all eigenvalues of $H$ are well-separated (by at least $N^{-1+\delta}$) then, as outlined in the sketch at the beginning of the proof, each eigenvalue $\lambda_\alpha$ of $H$ has an associated eigenvalue of $\widetilde H$, which lies in the interval $[\lambda_\alpha - N^{-1+\delta}/3, \lambda_\alpha)$. In fact, this eigenvalue typically lies at a distance $N^{-1}$ to the left of $\lambda_\alpha$, as follows from (6.22) and the fact that the typical size of $y$ is $N^{-1/2}$. However, if two eigenvalues of $H$ are closer than $N^{-1}$, this simple ordering breaks down. In general, therefore, all we can say about the eigenvalues of $\widetilde H$ associated with the eigenvalues of $H$ in $A_q$ is that they are close to the group $\{\lambda_\alpha\}_{\alpha \in A_q}$. Since the diameter of this group is small (see (6.28) below), this will be enough.

Figure 6.1: A graphical representation of the movement of the eigenvalues (or "beads") $\tilde e_1^\varepsilon(t), \tilde e_2^\varepsilon(t)$ of $M^\varepsilon(x_t^q)$ as $t$ ranges from $0$ to $1$. In this example we have $L = 3$, $k = 2$, and $0 < t_1 < t_2 < s_1 < s_2 < t_3 < s_3 < 1$.

Now we give the full proof. Recall that $|d_i| > 1$ is independent of $N$ for all $i$. Thus we get from (6.26) and (6.27) that, for large enough $N$ and small enough $\varepsilon$, all eigenvalues of $M^\varepsilon(x_0^q)$ are negative. (Here we used that $|\lambda_\alpha - x_0^q| \ge N^{-1+\delta'}/3$ for $\alpha \in A_q$.) We shall vary $t$ continuously from $0$ to $1$ and count the number of eigenvalues crossing the origin. Let $L := |A_q|$ and denote by

$$0 < s_1 < s_2 < \cdots < s_L < 1$$

the values of $t$ at which $x_t^q \in \sigma(H)$. (Recall that the eigenvalues of $H$ are distinct.) It is also convenient to write $s_0 := 0$ and $s_{L+1} := 1$. For $t \in [0, 1] \setminus \{s_1, \dots, s_L\}$, let

$$e_1^\varepsilon(t) \le e_2^\varepsilon(t) \le \cdots \le e_k^\varepsilon(t)$$

denote the ordered eigenvalues of $M^\varepsilon(x_t^q)$. We record the following fundamental properties of $e_1^\varepsilon(t), \dots, e_k^\varepsilon(t)$.

(i) For all $i = 1, \dots, k$, we have $e_i^\varepsilon(0) < 0$ for $N$ large enough and $\varepsilon$ small enough (depending on $N$).

(ii) For every $\ell = 0, \dots, L$ and $i = 1, \dots, k$, the function $e_i^\varepsilon$ is continuous on $(s_\ell, s_{\ell+1})$.

(iii) At each singular point $s_\ell$, $\ell = 1, \dots, L$, we have

$$e_i^\varepsilon(s_\ell^-) = e_{i+1}^\varepsilon(s_\ell^+) \qquad (i = 1, \dots, k-1).$$

(In particular, both one-sided limits exist.)

Property (i) was proved after (6.27). Property (ii) follows from (6.26). Property (iii) follows from Lemma 6.7, using (6.26) and the fact that $R(x)$ is continuous.

Moreover, the two following claims are true almost surely.

(a) For each $\ell = 1, \dots, L$ and $i = 1, \dots, k-1$ we have $e_i^\varepsilon(s_\ell^-) \ne 0$. (The remaining index $k$ satisfies $e_k^\varepsilon(s_\ell^-) = +\infty$.)

(b) If $e_i^\varepsilon(t) = 0$ for some $t \in [0, 1] \setminus \{s_1, \dots, s_L\}$ then $e_j^\varepsilon(t) \ne 0$ for all $j \ne i$.

In terms of beads $\tilde e_1^\varepsilon(t), \dots, \tilde e_k^\varepsilon(t) \in \overline{\mathbb{R}}$ (see below), the properties (a) and (b) can be informally summarized as: (a) if a bead is at $\infty$ then there is no bead at $0$, (b) at most one bead is at $0$. We omit the standard proofs of (a) and (b) (see footnote 6), which rely on the fact that the law of $\Delta$ is absolutely continuous.

In order to conclude our main argument, it is convenient to regard the eigenvalues $e_1^\varepsilon(t), \dots, e_k^\varepsilon(t)$ as elements of $\overline{\mathbb{R}} = \mathbb{R} \cup \{\infty\} \cong S^1$, the real projective line. From properties (ii) and (iii), it is apparent that we may rearrange the eigenvalues of $M^\varepsilon(x_t^q)$ as $\tilde e_1^\varepsilon(t), \dots, \tilde e_k^\varepsilon(t) \in \overline{\mathbb{R}}$ and extend them to functions ("beads") on the whole interval $[0, 1]$ in such a way that, almost surely, each $\tilde e_i^\varepsilon$ is a continuous $\overline{\mathbb{R}}$-valued function on $[0, 1]$.

We now claim the following.

(*) Almost surely, there are $L$ distinct times $t_1 < t_2 < \cdots < t_L \in [0, 1] \setminus \{s_1, \dots, s_L\}$ such that for each $\ell = 1, \dots, L$ there is an $i = 1, \dots, k$ with $\tilde e_i^\varepsilon(t_\ell) = 0$.

Let us prove (*). Let $n_i \in \mathbb{N}$ denote the number of times that $\tilde e_i^\varepsilon$ hits $\infty$ as $t$ ranges from $0$ to $1$. From (6.26) we find that $\sum_{i=1}^k n_i = L$ (recall that the eigenvalues of $H$ are distinct). Moreover, again from (6.26), we find that each such passage of $\infty$ by $\tilde e_i^\varepsilon$ always takes place in the same direction, namely from the positive reals to the negative reals with $t$ increasing. More precisely, if $\tilde e_i^\varepsilon(t_*) = \infty$ then there is a neighbourhood $I \ni t_*$ such that for all $t \in I$ we have

$$\tilde e_i^\varepsilon(t) \in \mathbb{R}_+ \ \text{for } t < t_* \qquad \text{and} \qquad \tilde e_i^\varepsilon(t) \in \mathbb{R}_- \ \text{for } t > t_*.$$

Since at time zero we have $\tilde e_i^\varepsilon(0) \in \mathbb{R}_-$ (see Property (i) above) we conclude that $\tilde e_i^\varepsilon$ has at least $n_i$ distinct zeros. (Recall that $n_i$ was defined as the number of times $\tilde e_i^\varepsilon$ hits $\infty$.) Moreover, by Property (a), the zeros of $\tilde e_i^\varepsilon$ are almost surely in $[0, 1] \setminus \{s_1, \dots, s_L\}$. By Property (b), the zeros of $\tilde e_1^\varepsilon, \dots, \tilde e_k^\varepsilon$ are almost surely disjoint. Since $\sum_{i=1}^k n_i = L$, the claim (*) follows.

From (*) we conclude that, almost surely, $M^\varepsilon(x)$ is singular in at least $L$ points in the set $[x_0^q, x_1^q] \setminus \sigma(H)$. Therefore $\widetilde H^\varepsilon$ has almost surely at least $L$ eigenvalues in $[x_0^q, x_1^q]$. Taking $\varepsilon \to 0$, we find that $\widetilde H$ has at least $L = |A_q|$ eigenvalues in $[x_0^q, x_1^q]$.

What remains is to prove that $\widetilde H$ has at most $|A_q|$ eigenvalues in $[x_0^q, x_1^q]$. We prove this using interlacing, similarly to the corresponding argument given in the sketch at the beginning of the proof. Together with Proposition 6.6, we have proved that there are at least $|A_1| + k^+$ eigenvalues of $\widetilde H$ in $[x_0^1, \infty)$. By interlacing (i.e. a repeated application of Lemma 6.2), we find that there are at most $|A_1| + k^+$ eigenvalues of $\widetilde H$ in $[x_0^1, \infty)$. We deduce, again using Proposition 6.6, that there are exactly $|A_1|$ eigenvalues of $\widetilde H$ in $[x_0^1, x_1^1]$.

6 The "standard" arguments rely on the fact that the set of singular Hermitian matrices is an algebraic variety of codimension one. In addition, the proof of (a) requires the following fact. Let $P$ be a rank-one orthogonal projector on $\mathbb{C}^k$ and $A$ a Hermitian $k \times k$ matrix; then, as $x \to \pm\infty$, exactly $k - 1$ eigenvalues of the matrix $A + xP$ converge, and their limits coincide with the eigenvalues of $A$ restricted to a map from $\ker P$ to $\ker P$. The proof of (b) uses that the set of Hermitian matrices with multiple eigenvalues at zero is an algebraic variety of codimension two.

We have proved that there are at least $|A_1| + |A_2| + k^+$ eigenvalues of $\widetilde H$ in $[x_0^2, \infty)$. Using eigenvalue interlacing, we find that there are at most $|A_1| + |A_2| + k^+$ eigenvalues of $\widetilde H$ in $[x_0^2, \infty)$. We conclude that there are exactly $|A_2|$ eigenvalues of $\widetilde H$ in $[x_0^2, x_1^2]$.

We may now repeat this argument for $q = 3, 4, \dots, Q$, to get that $\widetilde H$ has exactly $|A_q|$ eigenvalues in $[x_0^q, x_1^q]$, for $q = 1, 2, \dots, Q$. Moreover, by (6.25), we find for any $q = 1, \dots, Q$ that

$$\sup\bigl\{ |x - \lambda_\alpha| : \alpha \in A_q,\ x \in [x_0^q, x_1^q] \bigr\} \le \varphi^{C_4} N^{-1+\delta'} \le N^{-1+\delta}. \tag{6.28}$$

Therefore the proof is complete. □



6.5. Bootstrapping and conclusion of the proof of Theorem 2.7. We may now complete the proof of Theorem 2.7. In order to extend the statements of Propositions 6.6 and 6.8 to arbitrary $N$-dependent configurations $\mathbf{d} \in \mathcal{D}(C_2)$, we continuously deform an $N$-independent $\mathbf{d}$, for which Propositions 6.6 and 6.8 hold, to the desired $N$-dependent $\mathbf{d}$. The statements of Propositions 6.6 and 6.8 remain valid for all intermediate $\mathbf{d}$'s; this will follow from the continuity of the eigenvalues of $\widetilde H$ as a function of $\mathbf{d}$ and from Proposition 6.5. Roughly, Proposition 6.5 establishes a forbidden region, for arbitrary $\mathbf{d}$, which the eigenvalues of $\widetilde H$ cannot cross since they are deformed continuously.

Let $\mathbf{d}(1) \equiv \mathbf{d}_N(1) \in \mathcal{D}^*(C_2)$ be given (and possibly $N$-dependent), with associated $N$-independent indices $k^-, k^0, k^+$. Choose an $N$-independent $\mathbf{d}(0) \in \mathcal{D}(C_2)$ with the same indices $k^-, k^0, k^+$, such that $\mathbf{d}^0(0) = 0$ and $(\mathbf{d}^-(0), \mathbf{d}^+(0))$ satisfies (6.21). We shall use a bootstrap argument by choosing a continuous (possibly $N$-dependent) path $(\mathbf{d}(t) : 0 \le t \le 1)$ that connects $\mathbf{d}(0)$ and $\mathbf{d}(1)$. We require the path $\mathbf{d}(t)$ to have the following properties.

(i) For all $t \in [0, 1]$ the point $\mathbf{d}(t)$ satisfies (6.9) and $\mathbf{d}(t) \in \mathcal{D}(C_2)$.

(ii) If $I_i^+(\mathbf{d}(1)) \cap I_j^+(\mathbf{d}(1)) = \emptyset$ for a pair $1 \le i < j \le k^+$ then $I_i^+(\mathbf{d}(t)) \cap I_j^+(\mathbf{d}(t)) = \emptyset$ for all $t \in [0, 1]$. The same restriction is imposed for $+$ replaced with $-$.

It is easy to see that such a path exists. Informally, condition (ii) states that if the allowed regions for the outliers $i$ and $j$ do not overlap at time $t = 1$ (i.e. the outliers can be distinguished), then they may not overlap at any earlier time.

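We continue to work at fixed $N$ and with a fixed realization $H \equiv H^\omega$ with $\omega \in \Xi$. Let $C_2$ and $C_3$ be the constants from Proposition 6.5, and choose $\delta > 0$ such that $\varphi \le N^{1/3 - \delta}$. Define

$$\widetilde H(t) := H + V \operatorname{diag}\bigl(d_1(t), \dots, d_k(t)\bigr) V^*$$

and abbreviate $\mu_\alpha(t) := \lambda_\alpha(\widetilde H(t))$. By Propositions 6.6 and 6.8, we have that

$$\mu_{N - k^+ + i}(0) \in I_i^+(\mathbf{d}(0)) \qquad (i = 1, \dots, k^+), \tag{6.29a}$$
$$\mu_i(0) \in I_i^-(\mathbf{d}(0)) \qquad (i = 1, \dots, k^-), \tag{6.29b}$$

as well as

$$\lambda_\alpha \ge 2 - \varphi^{\widetilde K} N^{-2/3} \implies |\lambda_\alpha - \mu_{\alpha - k^+}(0)| \le N^{-2/3} \varphi^{-1}, \tag{6.30a}$$
$$\lambda_\alpha \le -2 + \varphi^{\widetilde K} N^{-2/3} \implies |\lambda_\alpha - \mu_{\alpha + k^-}(0)| \le N^{-2/3} \varphi^{-1}. \tag{6.30b}$$

In order to invoke a continuity argument, we note that Proposition 6.5 yields

$$\sigma(\widetilde H(t)) \cap S(K) \subset \Gamma(\mathbf{d}(t)) \tag{6.31}$$

for all $t \in [0, 1]$. Moreover, since $t \mapsto \widetilde H(t)$ is continuous, we find that $\mu_\alpha(t)$ is continuous in $t \in [0, 1]$ for all $\alpha$.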

Let us first analyse the outliers. We focus on the positive outliers associated with $\mathbf{d}^+$; the negative ones are dealt with in the same way. Assume first that the $k^+$ intervals $I_1^+(\mathbf{d}(t)), \dots, I_{k^+}^+(\mathbf{d}(t))$ are disjoint for $t = 1$. Then, from Property (ii) above, we know that they are disjoint for all $t \in [0, 1]$. Thus we find, from (6.29), (6.31), and the continuity of $t \mapsto \mu_\alpha(t)$ that

$$\mu_{N - k^+ + i}(t) \in I_i^+(\mathbf{d}(t)) \qquad (i = 1, \dots, k^+) \tag{6.32}$$

for all $t \in [0, 1]$, and in particular for $t = 1$.

If $I_1^+(\mathbf{d}(1)), \dots, I_{k^+}^+(\mathbf{d}(1))$ are not disjoint, the situation is only slightly more complicated. Let $\mathcal{B}$ denote the finest partition of $\{1, \dots, k^+\}$ such that $i$ and $j$ belong to the same block of $\mathcal{B}$ if $I_i^+(\mathbf{d}(1)) \cap I_j^+(\mathbf{d}(1)) \ne \emptyset$. Note that the blocks of $\mathcal{B}$ are sequences of consecutive integers. Denote by $B_i$ the block of $\mathcal{B}$ that contains $i$. Then (6.29) and (6.31) yield, instead of (6.32), that

$$\mu_{N - k^+ + i}(t) \in \bigcup_{j \in B_i} I_j^+(\mathbf{d}(t)) \qquad (i = 1, \dots, k^+) \tag{6.33}$$

for all $t \in [0, 1]$. At $t = 1$, the right-hand side of (6.33) is an interval that contains $\theta(d_j)$ for all $j \in B_i$. In order to estimate its size, we pick a $j \in B_i$ that is not the largest element of $B_i$. To streamline notation, abbreviate $d := d_j^+(1)$ and $d' := d_{j+1}^+(1)$. Our first task is to estimate $d' - d$. Since $I_j^+(\mathbf{d}(1)) \cap I_{j+1}^+(\mathbf{d}(1)) \ne \emptyset$, we have

$$\Bigl(1 - \frac{1}{d d'}\Bigr)(d' - d) \le \theta(d') - \theta(d) \le 2 \varphi^{C_3} N^{-1/2} (d' - 1)^{1/2},$$

where the second inequality follows from the definition of $I_j^+(\cdot)$. This yields

$$d' - d \le C \varphi^{C_3} N^{-1/2} (d' - 1)^{-1/2} \le C \varphi^{C_3} N^{-1/2} (d - 1)^{-1/2},$$

where the constant $C$ depends only on $\Sigma$. Thus we get

$$(d' - 1)^{1/2} \le (d - 1)^{1/2} \Bigl(1 + \frac{d' - d}{d - 1}\Bigr) \le (d - 1)^{1/2} \bigl(1 + C \varphi^{C_3} N^{-1/2} (d - 1)^{-3/2}\bigr) \le (d - 1)^{1/2} (1 + o(1)),$$

where the last inequality follows from (6.13). Repeating this estimate of $\theta(d_{j+1}^+(1)) - \theta(d_j^+(1))$ for the remaining $j \in B_i$, we find

$$\operatorname{diam}\Bigl( \bigcup_{j \in B_i} I_j^+(\mathbf{d}(1)) \Bigr) \le (2|B_i| + 2)\, \varphi^{C_3} N^{-1/2} \min_{j \in B_i} (d_j^+(1) - 1)^{1/2} (1 + o(1)).$$

This immediately yields

$$|\mu_{N - k^+ + i}(1) - \theta(d_i^+(1))| \le \varphi^{C_3 + 1} N^{-1/2} (d_i^+(1) - 1)^{1/2} \qquad (i = 1, \dots, k^+),$$

and the claim follows.

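What remains is the analysis of the extremal bulk eigenvalues. Once again, we make use of a continuity argument. As before, we only consider positive eigenvalues, $\lambda_\alpha \ge 2 - \varphi^{\widetilde K} N^{-2/3}$ for some $\widetilde K$ to be chosen below. Note that by interlacing, Lemma 6.2, we have

$$\lambda_{\alpha - k} \le \mu_\alpha \le \lambda_{\alpha + k} \tag{6.34}$$

(using the convention that $\lambda_\alpha = +\infty$ for $\alpha > N$). Recall the role of $K$ from the assumptions of Theorem 2.7. Therefore using the definition of $\Xi$ (see Theorem 3.7), we find that there is a $\widetilde K \equiv \widetilde K(K)$ such that if $\alpha \ge N - \varphi^K$ then

$$\lambda_{\alpha - k} \ge 2 - \varphi^{\widetilde K} N^{-2/3} \qquad \text{and} \qquad \mu_\alpha \ge 2 - \varphi^{\widetilde K} N^{-2/3}.$$

Let now $\alpha$ satisfy $N - \varphi^K \le \alpha \le N - k^+$. Using (6.30), (6.31), and Proposition 6.5, we find

$$|\lambda_{\alpha + k^+} - \mu_\alpha(0)| \le N^{-2/3} \varphi^{-1} \qquad \text{and} \qquad \operatorname{dist}\bigl(\mu_\alpha(t), \sigma(H)\bigr) \le N^{-2/3} \varphi^{-1} \tag{6.35}$$

for all $t \in [0, 1]$. In addition, we know the two following facts about $\mu_\alpha(t)$, for all $t \in [0, 1]$.

(i) $\mu_\alpha(t)$ is in the same connected component of $I^0 \subset \mathbb{R}$ as $\mu_\alpha(0)$ (by continuity of $\mu_\alpha(t)$ and Proposition 6.5).

(ii) $\mu_\alpha(t)$ satisfies the interlacing bound (6.34) for all $t \in [0, 1]$.

Let $B_\alpha$ be the set of $\beta = 1, \dots, N$ such that $\lambda_\beta$ and $\lambda_\alpha$ are in the same connected component of $I^0$. Thus we conclude from (i) and (ii) that

$$\mu_\alpha(t) \in \bigcup_{\substack{\beta \in B_{\alpha + k^+} : \\ |\alpha + k^+ - \beta| \le k}} \bigl[ \lambda_\beta - N^{-2/3} \varphi^{-1},\ \lambda_\beta + N^{-2/3} \varphi^{-1} \bigr].$$

Thus we get

$$|\lambda_{\alpha + k^+} - \mu_\alpha(t)| \le 2k N^{-2/3} \varphi^{-1} \tag{6.36}$$

for all $t \in [0, 1]$. Choosing

$$\widetilde C_2 := C_2 + 1, \qquad \widetilde C_3 := C_3 + 1$$

completes the proof of Theorem 2.7 (recall the definition (6.10)).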
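The interlacing bound (6.34), on which the continuity argument relies, can be checked numerically for a rank-$k$ deformation. The sketch below is illustrative only: the matrix ensemble, the deformation strengths, and all variable names are assumptions made for this example, and the two-sided bound is tested in the form $\lambda_{\alpha - k} \le \mu_\alpha \le \lambda_{\alpha + k}$ with the conventions $\lambda_\alpha = -\infty$ for $\alpha < 1$ and $\lambda_\alpha = +\infty$ for $\alpha > N$.

```python
import numpy as np

rng = np.random.default_rng(2)
N, k = 60, 3

g = rng.standard_normal((N, N))
H = (g + g.T) / np.sqrt(2 * N)                     # real symmetric, variance 1/N
V, _ = np.linalg.qr(rng.standard_normal((N, k)))   # orthonormal columns
d = np.array([2.0, -1.5, 0.7])                     # illustrative strengths

lam = np.linalg.eigvalsh(H)                        # lambda_1 <= ... <= lambda_N
mu = np.linalg.eigvalsh(H + V @ np.diag(d) @ V.T)  # mu_1 <= ... <= mu_N

# Pad with -inf / +inf to encode lambda_alpha outside the range 1..N.
lam_pad = np.concatenate([np.full(k, -np.inf), lam, np.full(k, np.inf)])
lower = lam_pad[:N]            # lambda_{alpha - k}
upper = lam_pad[2 * k:]        # lambda_{alpha + k}

interlaced = bool(np.all(lower <= mu) and np.all(mu <= upper))
print(interlaced)
```

The bound holds for any Hermitian perturbation of rank at most $k$, regardless of the signs of the $d_i$, which is why it can be applied along the whole path $\mathbf{d}(t)$.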

7. Distribution of the outliers: proof of Theorem 2.14 

7.1. Reduction to the law of $G_{v^{(i)} v^{(i)}}(\theta(d_i))$. The following proposition reduces the problem to analysing a single explicit random variable.

Proposition 7.1. There is a constant $C_2$, depending on $\zeta$, such that the following holds. Suppose that

$$|d_i| \le \Sigma - 1, \qquad \bigl| |d_i| - 1 \bigr| \ge \varphi^{C_2} N^{-1/3}$$

for all $i = 1, \dots, k$. Suppose moreover that for all $i \in O$ (2.24) holds. Recall the definitions (2.16) and (2.17). Then we have for all $i \in O$

$$N^{1/2} (|d_i| - 1)^{-1/2} \bigl( \mu_{\alpha(i)} - \theta(d_i) \bigr) = -\bigl(1 + O(\varphi^{-1})\bigr) (|d_i| + 1) N^{1/2} (|d_i| - 1)^{1/2} \Bigl( G_{v^{(i)} v^{(i)}}(\theta(d_i)) + \frac{1}{d_i} \Bigr) + O(\varphi^{-1})$$

with $\zeta$-high probability.

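Before proving Proposition 7.1, we record the following auxiliary result.

Lemma 7.2. Let $C_1$ denote the constant from Theorem 2.3. For any

$$x \in \bigl[-\Sigma, -2 - \varphi^{C_1} N^{-2/3}\bigr] \cup \bigl[2 + \varphi^{C_1} N^{-2/3}, \Sigma\bigr]$$

and any normalized $v \in \mathbb{C}^N$ we have

$$|\partial_x G_{vv}(x) - \partial_x m(x)| \le \varphi^{C_\zeta} N^{-1/3} \kappa_x^{-1} \tag{7.1}$$

with $\zeta$-high probability. More generally, we have, for any normalized $v, w \in \mathbb{C}^N$,

$$|\partial_x G_{vw}(x) - \partial_x m(x) \langle v, w\rangle| \le \varphi^{C_\zeta} N^{-1/3} \kappa_x^{-1} \tag{7.2}$$

with $\zeta$-high probability.

Proof. By symmetry, we may assume that $x \ge 0$. Moreover, (7.2) follows from (7.1) and polarization. We therefore prove (7.1) for $x \ge 0$. We have

$$\partial_x G_{vv}(x) = \sum_\alpha \frac{|\langle u^{(\alpha)}, v\rangle|^2}{(\lambda_\alpha - x)^2}.$$

Choose $x \ge 2 + N^{-2/3} \varphi^{C_1}$ and abbreviate $\kappa \equiv \kappa_x$. Thus we get, for $\eta \ge N^{-1}$,

$$\Bigl| \partial_x G_{vv}(x) - \frac{\operatorname{Im} G_{vv}(x + i\eta)}{\eta} \Bigr| = \Biggl| \sum_\alpha \frac{|\langle u^{(\alpha)}, v\rangle|^2}{(\lambda_\alpha - x)^2} - \sum_\alpha \frac{|\langle u^{(\alpha)}, v\rangle|^2}{(\lambda_\alpha - x)^2 + \eta^2} \Biggr| \le \frac{\eta^2}{(x - \lambda_N)^2} \frac{1}{\eta} \operatorname{Im} G_{vv}(x + i\eta) \le \frac{2\eta^2}{\kappa^2} \frac{1}{\eta} \operatorname{Im} G_{vv}(x + i\eta)$$

with $\zeta$-high probability, where in the last step we used Theorem 3.7. (In the proof of Theorem 2.3, the constant $C_1$ was chosen large enough for this application of Theorem 3.7; see (5.6).) A similar calculation using the definition (2.4) yields

$$\Bigl| \partial_x m(x) - \frac{1}{\eta} \operatorname{Im} m(x + i\eta) \Bigr| \le \frac{\eta^2}{\kappa^2} \frac{1}{\eta} \operatorname{Im} m(x + i\eta).$$

Therefore we get, using Theorem 2.3 and Lemma 3.2,

$$|\partial_x G_{vv}(x) - \partial_x m(x)| \le \frac{2\eta}{\kappa^2} \bigl( \operatorname{Im} G_{vv}(x + i\eta) + \operatorname{Im} m(x + i\eta) \bigr) + \frac{1}{\eta} \varphi^{C_\zeta} \sqrt{\frac{\operatorname{Im} m(x + i\eta)}{N\eta}}$$

with $\zeta$-high probability. Choosing $\eta := N^{-1/6} \kappa^{3/4}$ yields the claim. □

Proof of Proposition 7.1. We only prove the claim for the case $d_i > 1$; the case $d_i < -1$ is handled similarly.

For $2 + \varphi^{C_1} N^{-2/3} \le x \le \Sigma$, where $C_1$ is the constant from Theorem 2.3, we define the $k \times k$ Hermitian matrices $A(x)$ and $\widetilde A(x)$ through

$$A_{ij}(x) := G_{v^{(i)} v^{(j)}}(x) - m(x) \delta_{ij} + d_i^{-1} \delta_{ij}, \qquad \widetilde A_{ij}(x) := \delta_{ij} \bigl( G_{v^{(i)} v^{(i)}}(x) - m(x) + d_i^{-1} \bigr).$$

(Here we subtract $m(x) 1$ so as to ensure that $\partial_x A(x)$ is well-behaved; see below.) We denote the ordered eigenvalues of $A(x)$ and $\widetilde A(x)$ by $a_1(x) \le \cdots \le a_k(x)$ and $\tilde a_1(x) \le \cdots \le \tilde a_k(x)$ respectively.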

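For the rest of the proof we fix $i \in O$ satisfying $d_i > 1$. We abbreviate $\theta_i := \theta(d_i)$. We begin by comparing the eigenvalues of $\widetilde A(\theta_i)$ and $D^{-1}$. Define the eigenvalue index $r \equiv r(i) = 1, \dots, k$ through

$$\tilde a_r(x) = \frac{1}{d_i} + G_{v^{(i)} v^{(i)}}(x) - m(x).$$

In particular,

$$\tilde a_r(\theta_i) = G_{v^{(i)} v^{(i)}}(\theta_i) + \frac{2}{d_i}. \tag{7.3}$$

Theorem 2.3 implies that

$$\bigl| G_{v^{(j)} v^{(j)}}(\theta_i) - m(\theta_i) \bigr| \le \varphi^{C_\zeta} N^{-1/2} (d_i - 1)^{-1/2} \tag{7.4}$$

with $\zeta$-high probability for $j = 1, \dots, k$. In particular,

$$\Bigl| \tilde a_r(\theta_i) - \frac{1}{d_i} \Bigr| \le \varphi^{C_\zeta} N^{-1/2} (d_i - 1)^{-1/2} \tag{7.5}$$

with $\zeta$-high probability. Moreover, (7.4) and the condition (2.24) yield, for $j \ne i$,

$$\bigl| G_{v^{(j)} v^{(j)}}(\theta_i) - m(\theta_i) \bigr| \ll |d_i - d_j|$$

with $\zeta$-high probability, provided $C_2$ is chosen large enough. We therefore conclude that

$$\min_{j \ne r} |\tilde a_j(\theta_i) - \tilde a_r(\theta_i)| \ge \varphi^{C_2 - 1} N^{-1/2} (d_i - 1)^{-1/2} \tag{7.6}$$

with $\zeta$-high probability, provided $C_2$ is large enough.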

Next, we compare the eigenvalues of $A(\theta_i)$ and $\widetilde A(\theta_i)$ using second-order perturbation theory (the first-order correction vanishes by definition of $A$ and $\widetilde A$). Theorem 2.3 yields

$$\|A(\theta_i) - \widetilde A(\theta_i)\| \le \varphi^{C_\zeta} N^{-1/2} (d_i - 1)^{-1/2}$$

with $\zeta$-high probability. Therefore (7.6) and nondegenerate second-order perturbation theory yield, for large enough $C_2$,

$$a_r(\theta_i) = \tilde a_r(\theta_i) + O\Biggl( \frac{\varphi^{C_\zeta} N^{-1} (d_i - 1)^{-1}}{\min_{j \ne r} |\tilde a_j(\theta_i) - \tilde a_r(\theta_i)|} \Biggr) = \tilde a_r(\theta_i) + O\bigl( \varphi^{C_\zeta - C_2} N^{-1/2} (d_i - 1)^{-1/2} \bigr) \tag{7.7}$$

with $\zeta$-high probability.

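Next, we analyse $A(x)$ and make the link to $\mu_{\alpha(i)}$. From Lemma 7.2 we find

$$\|\partial_x A(x)\| \le \varphi^{C_\zeta} N^{-1/3} \kappa_x^{-1}$$

with $\zeta$-high probability. In particular, we have for all $j = 1, \dots, k$ that

$$|a_j(x) - a_j(y)| \le \varphi^{C_\zeta} N^{-1/3} (\kappa_x^{-1} + \kappa_y^{-1}) |x - y| \tag{7.8}$$

with $\zeta$-high probability, provided that $2 + \varphi^{C_1} N^{-2/3} \le x, y \le \Sigma$.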

Recall the definition (2.17) of $\alpha(i)$. From Lemma 6.1 and Theorem 3.7, we know that $\mu_{\alpha(i)}$ is characterized by the property that there is a $q \equiv q(i) \in \{1, \dots, k\}$ such that

$$a_q(\mu_{\alpha(i)}) = -m(\mu_{\alpha(i)}).$$

By Theorem 2.7 we have

$$|\mu_{\alpha(i)} - \theta_i| \le \varphi^{C_3} N^{-1/2} (d_i - 1)^{1/2} \tag{7.9}$$

with $\zeta$-high probability. Provided $C_2$ is large enough (depending on $C_3$), it is easy to see from (7.9) that

$$\mu_{\alpha(i)} - 2 \asymp \theta_i - 2 \asymp (d_i - 1)^2 \tag{7.10}$$

with $\zeta$-high probability. Thus we find, using (7.8), (7.9), and (7.10), that for large enough $C_2$ we have

$$m(\mu_{\alpha(i)}) = -a_q(\theta_i) + O\bigl( \varphi^{C_\zeta} N^{-5/6} (d_i - 1)^{-3/2} \bigr) \tag{7.11}$$

with $\zeta$-high probability. (Here we absorbed the constant $C_3$ into $C_\zeta$.)

We now prove that $q = r$ with $\zeta$-high probability provided $C_2$ is large enough. Assume by contradiction that $q \ne r$. Then we get, using Theorem 2.3 and the condition (2.24), that

$$\Bigl| a_q(\theta_i) - \frac{1}{d_i} \Bigr| \ge \varphi^{C_2 - 1} N^{-1/2} (d_i - 1)^{-1/2} \tag{7.12}$$

with $\zeta$-high probability. Moreover, (7.8), (7.9), and (7.10) yield

$$a_q(\theta_i) = a_q(\mu_{\alpha(i)}) + O\bigl( \varphi^{C_\zeta} N^{-5/6} (d_i - 1)^{-3/2} \bigr) = -m(\mu_{\alpha(i)}) + O\bigl( \varphi^{C_\zeta} N^{-5/6} (d_i - 1)^{-3/2} \bigr) = \frac{1}{d_i} + O\bigl( \varphi^{C_3} N^{-1/2} (d_i - 1)^{-1/2} + \varphi^{C_\zeta} N^{-5/6} (d_i - 1)^{-3/2} \bigr)$$

with $\zeta$-high probability, where in the last step we used (6.5). Together with (7.12), this yields the desired contradiction provided $C_2$ is large enough. Hence $q = r$.

Putting (7.3), (7.11), and (7.7) together, we get

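$$m(\mu_{\alpha(i)}) = -G_{v^{(i)} v^{(i)}}(\theta_i) - \frac{2}{d_i} + O\bigl( \varphi^{C_\zeta} N^{-5/6} (d_i - 1)^{-3/2} + \varphi^{C_\zeta - C_2} N^{-1/2} (d_i - 1)^{-1/2} \bigr)$$

with $\zeta$-high probability. Thus we find that, for all $x$ between $\theta_i$ and $\mu_{\alpha(i)}$, we have

$$m'(x) = m'(\theta_i) + O\bigl( \varphi^{C_3} N^{-1/2} (d_i - 1)^{-5/2} \bigr) = m'(\theta_i) \bigl( 1 + O(\varphi^{-1}) \bigr)$$

with $\zeta$-high probability, where we used (6.5) and (7.9). Using (6.2), (7.10), and (6.5), we conclude that

$$\mu_{\alpha(i)} - \theta_i = -\bigl(1 + O(\varphi^{-1})\bigr) (d_i^2 - 1) \Bigl( G_{v^{(i)} v^{(i)}}(\theta_i) + d_i^{-1} \Bigr) + O\bigl( \varphi^{C_\zeta} N^{-5/6} (d_i - 1)^{-1/2} + \varphi^{C_\zeta - C_2} N^{-1/2} (d_i - 1)^{1/2} \bigr)$$

with $\zeta$-high probability. The claim now follows for large enough $C_2$, using the identity (6.2). □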
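The last steps of this proof rest on two elementary properties of the semicircle Stieltjes transform at the outlier location. The following sketch checks them numerically under explicit assumptions that are not spelled out in this section: $m$ is taken to be the Stieltjes transform of the semicircle law on $[-2, 2]$, so that $m(x) = (-x + \sqrt{x^2 - 4})/2$ for real $x > 2$, and $\theta(d) = d + 1/d$. The function names are illustrative.

```python
import numpy as np

def m_sc(x):
    """Stieltjes transform of the semicircle law on [-2, 2], for real x > 2.
    (Assumed closed form; not quoted from this section.)"""
    return (-x + np.sqrt(x * x - 4.0)) / 2.0

def theta(d):
    """Assumed outlier location theta(d) = d + 1/d."""
    return d + 1.0 / d

d_values = np.array([1.2, 1.5, 2.0, 3.0])
x = theta(d_values)

# Identity 1: m(theta(d)) = -1/d, so that G + 2/d - 1/d = G + 1/d in the proof.
m_at_theta = m_sc(x)

# Identity 2: m'(theta(d)) = 1/(d^2 - 1), checked with a central difference.
h = 1e-6
m_prime = (m_sc(x + h) - m_sc(x - h)) / (2 * h)

print(m_at_theta + 1.0 / d_values)             # approximately zero
print(m_prime * (d_values ** 2 - 1.0))         # approximately one
```

Under these assumptions, dividing $m(\mu_{\alpha(i)}) - m(\theta_i)$ by $m'(\theta_i) = (d_i^2 - 1)^{-1}$ is what produces the factor $(d_i^2 - 1)$ in the final display.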

7.2. The GOE/GUE case. By Proposition 7.1, it is enough to analyse the random variable

$$X := N^{1/2} (|d| + 1) (|d| - 1)^{1/2} \Bigl( G_{vv}(\theta) + \frac{1}{d} \Bigr), \tag{7.13}$$

where $v \in \mathbb{C}^N$ is normalized, $d$ satisfies

$$1 + \varphi^{C_2} N^{-1/3} \le |d| \le \Sigma - 1, \tag{7.14}$$

and we abbreviated $\theta \equiv \theta(d)$. For definiteness, we choose $d > 1$ in the following.

The following notion of convergence of random variables is convenient for our needs. 

Definition 7.3. Two sequences of random variables, $\{A_N\}$ and $\{B_N\}$, are asymptotically equal in distribution, denoted $A_N \overset{d}{\sim} B_N$, if they are tight and satisfy

$$\lim_{N \to \infty} \bigl( \mathbb{E} f(A_N) - \mathbb{E} f(B_N) \bigr) = 0 \tag{7.15}$$

for all bounded and continuous $f$.



Remark 7.4. Definition 7.3 extends the notion of convergence in distribution, in the sense that $\mathbb{E} f(A_N)$ need not have a limit as $N \to \infty$.

Remark 7.5. In order to show that $A_N \overset{d}{\sim} B_N$, it suffices to establish the tightness of either $\{A_N\}$ or $\{B_N\}$ and to verify (7.15) for all $f \in C_c^\infty(\mathbb{R})$. Indeed, if $\{A_N\}$ is tight then so is $\{B_N\}$, by (7.15). By tightness of $A_N$ and $B_N$, we may replace in (7.15) the bounded and continuous $f$ with a compactly supported continuous function $g$. Next, we can approximate $g$ uniformly with $C_c^\infty$-functions.

Remark 7.6. Clearly, $A_N \overset{d}{\sim} B_N$ if $A_N \overset{d}{=} B_N$ for all $N$.

Lemma 7.7. Let $A_N \overset{d}{\sim} B_N$ and $R_N$ satisfy $\lim_N \mathbb{P}(|R_N| \le \varepsilon_N) = 1$, where $\{\varepsilon_N\}$ is a positive null sequence. Then $A_N \overset{d}{\sim} B_N + R_N$.

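PROOF. By Remark 7.5, it suffices to prove (7.15) for $f \in C^1(\mathbb{R})$ such that $f$ and $f'$ are bounded. Then

$$\mathbb{E} f(A_N) - \mathbb{E} f(B_N + R_N) = \bigl( \mathbb{E} f(A_N) - \mathbb{E} f(B_N) \bigr) + \bigl( \mathbb{E} f(B_N) - \mathbb{E} f(B_N + R_N) \bigr)$$
$$= o(1) + \mathbb{E} \bigl[ \mathbf{1}(|R_N| \le \varepsilon_N) \bigl( f(B_N) - f(B_N + R_N) \bigr) \bigr]$$
$$= o(1),$$

where in the last step we used the boundedness of $f'$. □

Lemma 7.8. Let $\{A_N\}$, $\{A'_N\}$, $\{B_N\}$, and $\{B'_N\}$ be sequences of random variables. Suppose that $A_N \overset{d}{\sim} A'_N$, $B_N \overset{d}{\sim} B'_N$, $A_N$ and $B_N$ are independent, and $A'_N$ and $B'_N$ are independent. Then

$$A_N + B_N \overset{d}{\sim} A'_N + B'_N.$$

Proof. Without loss of generality, we may assume that $A_N, B_N, A'_N, B'_N$ are independent (after replacing $A'_N$ and $B'_N$ with new random variables without changing their laws). Then for any $\lambda \in \mathbb{R}$ we have

$$\mathbb{E} e^{i\lambda(A_N + B_N)} - \mathbb{E} e^{i\lambda(A'_N + B'_N)} = \mathbb{E} \bigl[ e^{i\lambda A_N} \bigl( e^{i\lambda B_N} - e^{i\lambda B'_N} \bigr) \bigr] + \mathbb{E} \bigl[ \bigl( e^{i\lambda A_N} - e^{i\lambda A'_N} \bigr) e^{i\lambda B'_N} \bigr]$$
$$= \mathbb{E} e^{i\lambda A_N}\, \mathbb{E} \bigl( e^{i\lambda B_N} - e^{i\lambda B'_N} \bigr) + \mathbb{E} \bigl( e^{i\lambda A_N} - e^{i\lambda A'_N} \bigr)\, \mathbb{E} e^{i\lambda B'_N} \longrightarrow 0$$

as $N \to \infty$.

Next, we observe that $A_N + B_N$ and $A'_N + B'_N$ are tight. Therefore, recalling Remark 7.5, we find that it suffices to prove

$$\mathbb{E} f(A_N + B_N) - \mathbb{E} f(A'_N + B'_N) \longrightarrow 0$$

for $f \in C_c^\infty$. Denoting by $\hat f$ the Fourier transform of $f$, we find

$$\mathbb{E} f(A_N + B_N) - \mathbb{E} f(A'_N + B'_N) = \int d\lambda\, \hat f(\lambda) \Bigl( \mathbb{E} e^{i\lambda(A_N + B_N)} - \mathbb{E} e^{i\lambda(A'_N + B'_N)} \Bigr) \longrightarrow 0$$

by dominated convergence. □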

PROPOSITION 7.9. Let $H$ be a GOE/GUE matrix. Assume that $d$ satisfies (7.14). Then for large enough $C_2$ we have

$$X \overset{d}{\sim} \mathcal{N}\Bigl( 0, \frac{2(d + 1)}{\beta d^2} \Bigr).$$

PROOF. By unitary invariance, we have $G_{vv} \overset{d}{=} G_{11}$, where $\overset{d}{=}$ denotes equality in distribution. In order to handle the exceptional low-probability events, we add a small imaginary part to the spectral parameter $z := \theta + i N^{-4}$. Throughout the following we abbreviate $G \equiv G(z)$ and $m \equiv m(z)$. Writing $a^* := (h_{12}, h_{13}, \dots, h_{1N})$, we get from Schur's formula and (2.5) that

$$G_{11} = \frac{1}{h_{11} - z - a^* G^{(1)} a} = \frac{1}{-m - z + h_{11} - (a^* G^{(1)} a - m)} = m - m^2 h_{11} + m^2 (a^* G^{(1)} a - m) + O(|h_{11}|^2) + O\bigl( |a^* G^{(1)} a - m|^2 \bigr) \tag{7.16}$$

with $\zeta$-high probability. Again by unitary invariance, we have $a^* G^{(1)} a \overset{d}{=} \|a\|^2 G^{(1)}_{22}$. Moreover, both sides are independent of $h_{11}$, so that

$$-m^2 h_{11} + m^2 (a^* G^{(1)} a - m) \overset{d}{=} -m^2 h_{11} + m^2 \bigl( \|a\|^2 G^{(1)}_{22} - m \bigr). \tag{7.17}$$

In order to estimate the error term in (7.16), we write

$$\|a\|^2 G^{(1)}_{22} - m = (\|a\|^2 - 1) G^{(1)}_{22} + \bigl( G^{(1)}_{22} - m \bigr). \tag{7.18}$$

Using (3.6) to estimate $G^{(1)}_{22} - G_{22}$, as well as Theorem 2.3, Lemma 3.5, and Lemma 3.2, we therefore find that

$$\bigl| \|a\|^2 G^{(1)}_{22} - m \bigr| \le \varphi^{C_\zeta} N^{-1/2} (d - 1)^{-1/2} \tag{7.19}$$

with $\zeta$-high probability. Moreover, we have the trivial bound $\mathbb{E} \bigl| \|a\|^2 G^{(1)}_{22} - m \bigr|^k \le (kN)^{Ck}$ for $k \in \mathbb{N}$.

From (7.16), (7.17), (7.18), and (7.19), we conclude that there exist random variables $R_1$ and $R_2$ satisfying

$$|R_1| + |R_2| \le \varphi^{C_\zeta} N^{-1} (d - 1)^{-1} \tag{7.20}$$

with $\zeta$-high probability, the rough bound

$$\mathbb{E} (|R_1| + |R_2|)^k \le (kN)^{Ck}, \tag{7.21}$$

and

$$(G_{11} - m) + R_1 = -m^2 h_{11} + m^2 (a^* G^{(1)} a - m)$$
$$\overset{d}{=} -m^2 h_{11} + m^2 \bigl( \|a\|^2 G^{(1)}_{22} - m \bigr)$$
$$= -m^2 h_{11} + m^3 (\|a\|^2 - 1) + m^2 \bigl( G^{(1)}_{22} - m \bigr) + R_2.$$
Defining 

Yi := iV 1 / 2 (d+l)(rf-l) 1 / 2 Rc(Gf 1 ) -m), Y 2 := A^/ 2 (d + l)(d - 1) 1/2 Re(G$ - m) , 
:= iV 1 / 2 Rc(-m 2 ^ 11 +m 3 (||a|| 2 -l)) , R, := A^/ 2 (d + l)(d - 1) 1/2 Rcfl, (i = l,2), 

we therefore get 

Ki+iZi = (d+l)(d-l) 1/2 TL r + m 2 r 2 +i? 2 . (7.22) 

In order to infer the distribution of $Y_1$ from (7.22), we observe that the random variables $Y_2$ and $W$ are independent. Also, $Y_1 \overset{d}{=} Y_2$. Recalling Theorem 2.3 and (3.6), we find the bounds

\Yi\<<P C< , < ^AT- 1/2 (d-l)- 1/2 (i = 1,2) (7.23) 

with C-high probability, and the rough bounds 

\Yi\ < N 2 , E\Ri\ k < (kN) ck (i = 1,2). (7.24) 

Moreover, by the Central Limit Theorem 

where we used (6.2). 

Next, let B and Z 2 be independent random variables whose laws are given by 
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where we introduced 



2 _ 2 _ 4 (d 2 -l)(d+l) 2(d 2 + l) _ 2(d+l) 



d 4 -l /3d 6 /3d 2 ' 

Defining

$$Z_1 := (d+1)(d-1)^{1/2} B + d^{-2} Z_2 , \tag{7.26}$$

we find that $Z_1 \overset{d}{=} Z_2$. Moreover, a standard moment calculation and the definition of $W$ yield

$$\lim_{N \to \infty} \big( \mathbb{E} W^k - \mathbb{E} B^k \big) = 0 ; \tag{7.27}$$

as usual, only the pairings in the moment expansion of $\mathbb{E} W^k$ survive the limit $N \to \infty$. (See also (7.25), which however cannot be used to deduce (7.27) directly.)

We now compare the distributions of $Y_1$ and $Z_1$ by computing moments. Note that the family $\{\mathbb{E} Z_1^k\}_{N \in \mathbb{N}}$ is bounded for each $k \in \mathbb{N}$. We claim that

$$\lim_{N \to \infty} \big( \mathbb{E} Y_1^k - \mathbb{E} Z_1^k \big) = 0 \tag{7.28}$$

for all $k \in \mathbb{N}$. (This will imply that $Y_1 \overset{d}{\sim} Z_1$.) We shall prove (7.28) by induction on $k$. Taking the expectation of (7.22) yields

EYi = m 2 EY! + 0(<p c <N- 1 / 2 (d- 1)~ 1/2 ) 

where we used (7.23), (7.24), and EW = O^" 1 / 2 ). Therefore 

EYi s$ Ctp c <N-V 2 (d-l)- 3 P = o(l) 

provided C2 in (7.14) is large enough. Here we used that 

m(z) = d^+O^ -3 ), (7.29) 

as follows from the definition of z = 9 + \N , (5.5), Lemma 3.2, and (6.2). Therefore (7.28) for k = 1 
follows using EZi = 0. 

For the induction step, we assume that (7.28) holds for all k' < k — 1. From (7.22) we find 

ey^ + ^Qe^y*-') 

= E((d + l)(d - if^W + m 2 Y 2 f + (*) E (^'((d + - !) 1/2w/ + m 2 ^)*" 1 ) • (7.30) 
We estimate the summands on the left-hand side by 

\E(R[Y^~ 1 ) I s$ iY c exp(-^ c )+ (^ c <7V- 1/2 (d-l)- 1/2 ) ElYxl' 1 -' 
cfa^N-Wid-l)- 1 ' 2 ) 1 ^ 

< ^iV- 1 /2 (d _l)-l/2 ; 
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where in the first step we used (7.23) and (7.24), in the second step the estimate E|Yi| fc ~' ^ (p C( as follows 
from the induction assumption (7.28) applied to even moments (recall that Y\ is real) as well as (7.23) and 
(7.24), and in the third step the fact that 1^1. Note that the constant is independent of k. A similar 
estimate applies to the summands on the right-hand side of (7.30). Thus (7.30) yields 



EYf = md+l^d-lf^W + m 2 Y 2 ) k + 0(v C< N- 1 l 2 {d-l)- 1 ' 2 ) 

= m 2k EYf + J2 (T) E (( d + - l) 1/2 W) l E(m 2 Y 2 ) k - 1 + O^N^id - I)- 1 ' 2 ) , 



1=2 



where in the second step we used the induction assumption and the estimate EW = 0(N 1 I 2 ). Therefore 
we get 

EYf = Y~^2k X/ f/l^(^ + l) 1/2 W) 1 E(m 2 Y 2 ) k ~ l + 0((p c< N~ 1/2 (d - 1)~ 3 / 2 ) , (7.31) 

m 1=2 ^ ' 

where we used (7.29). 

In order to conclude the proof of (7.28), we deduce from (7.26) that 

EZ\ = Y ^ k j^^E{{d + l){d-lfl 2 B) l E{d- 2 Z 2 ) k - 1 . (7.32) 

Using the induction assumption (7.28) for k' — k — I, (7.29), and the condition I > 2, we get from (7.31), 
(7.32), and (7.27) that 

lim (EYi — EZ,) = 

for large enough C 2 . This concludes the proof of (7.28). 

Next, by definition we have = A/"(0, 1). Moreover, we have that £ e [c, C] for some positive 

constants c and C depending only on S. Together with (7.28) for k = 2, we infer that the families {C _1 i / i}A r eN 
and Zij^gN are tight. Therefore we get from (7.28) that 

lim (E/(r^i) - E/(r^i)) = (7.33) 
for any continuous bounded function /. Next, we estimate 

\G u (0)-G${z)\ s$ \Gn(0)-Gu(z)\ + \G n (z)- G$(z)\ 

sC N- 4 N 2 + ( P c <N- 1 {d- I)- 1 < ip^N-^d- l)" 1 

with C-high probability, where in the second step we used Lemma 7.2, (5.5), and Lemma 3.2 to estimate the 
first term, and Theorem 2.3 and (6.1) to estimate the second term. Therefore 

X = N 1 ' 2 {d+l){d-l) 1 ' 2 {G 11 {e)+d- 1 ) = Y 1 +0(<p c <N- 1 ' 2 (d-l)- 1 ' 2 ) = Yi+o(l) 

with C-high probability, where in the second step we used (7.29). Therefore (7.33), the fact that Z = Z\, 
and dominated convergence yield 

limjEf(C 1 X)-Ef(C 1 Z)) =0. (7.34) 
The claim now follows from Lemma 7.10 below. □ 
Lemma 7.10. Let $\{\xi_N\}$ be a bounded deterministic sequence. Let $A_\infty, A_1, A_2, \dots$ be random variables such that $A_N$ converges weakly to $A_\infty$. Then we have for any bounded continuous function $f$

$$\mathbb{E} f(\xi_N A_N) - \mathbb{E} f(\xi_N A_\infty) \longrightarrow 0$$

as $N \to \infty$.

Proof. By Skorokhod's representation theorem, there exist new random variables $\tilde A_\infty, \tilde A_1, \tilde A_2, \dots$ such that $\tilde A_\infty \overset{d}{=} A_\infty$, $\tilde A_N \overset{d}{=} A_N$ for all $N \in \mathbb{N}$, and $\tilde A_N \to \tilde A_\infty$ almost surely. Let $\omega$ be such that $\tilde A_N(\omega) \to \tilde A_\infty(\omega)$. By assumption on $\xi_N$, we find that there exists a $C \equiv C(\omega)$ such that $\xi_N \tilde A_N(\omega) \in [-C, C]$ and $\xi_N \tilde A_\infty(\omega) \in [-C, C]$ for all $N \in \mathbb{N}$. Since $f$ is uniformly continuous on $[-C, C]$, we find that

$$\lim_{N \to \infty} \Big( f\big(\xi_N \tilde A_N(\omega)\big) - f\big(\xi_N \tilde A_\infty(\omega)\big) \Big) = 0 .$$

The claim now follows by dominated convergence. □
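The point of Lemma 7.10 is that the deterministic prefactor need not converge. The sketch below illustrates this on a toy model; the law of $A_\infty$, the oscillating sequence, and the test function `atan` are illustrative assumptions, not objects from the paper. Since `atan` is 1-Lipschitz and all quantities are bounded, the gap is at most $3/N$ in this example.

```python
import math

# Hypothetical toy law for A_infinity: (value, probability) pairs.
A_INF = [(0.5, 0.3), (-1.0, 0.7)]

def xi(n):
    """Bounded deterministic sequence that does not converge (takes values 1 and 3)."""
    return 2.0 + (-1.0) ** n

def law_of_a(n):
    """Law of A_N := (1 + 1/N) * A_infinity, which converges weakly to A_infinity."""
    return [((1.0 + 1.0 / n) * x, p) for x, p in A_INF]

def expect(f, law):
    """Expectation of f under a finite discrete law."""
    return sum(p * f(x) for x, p in law)

def gap(n, f=math.atan):
    """|E f(xi_N A_N) - E f(xi_N A_infinity)| in the toy model."""
    s = xi(n)
    return abs(expect(lambda x: f(s * x), law_of_a(n))
               - expect(lambda x: f(s * x), A_INF))

gaps = {n: gap(n) for n in (1, 10, 100, 1000)}
```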



7.3. The almost-GOE/GUE case. As it turns out, replacing the matrix element $h_{ij}$ with a Gaussian in the Green function comparison step below (Section 7.4) is only possible if $|v_i| \le \varphi^{-D}$ and $|v_j| \le \varphi^{-D}$, for some large enough constant $D > 0$. If this assumption is not satisfied, we first have to replace $h_{ij}$ with a Gaussian using a different method, which effectively keeps track of the fluctuations of $G_{\mathbf{v}\mathbf{v}}$ resulting from large components of $\mathbf{v}$. Thus we shall proceed in two steps:

(i) We compare the original Wigner matrix $H$ with $\tilde H$, a Wigner matrix obtained from $H$ by replacing the $(i, j)$-th entry of $H$ with a Gaussian whenever $|v_i| \le \varphi^{-D}$ and $|v_j| \le \varphi^{-D}$.

(ii) We compare the matrix $\tilde H$ to a Gaussian matrix.

Step (ii) is performed in this section. To simplify notation, we write $H$ instead of $\tilde H$ throughout this section. Step (i) is performed using Green function comparison in Section 7.4 below.
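The replacement rule in step (i) can be sketched in a few lines. This is only an illustration for the real symmetric case under simplifying assumptions: the function names are hypothetical, the threshold stands in for $\varphi^{-D}$, the $N^{-1/2}$ scaling is omitted, and the Gaussian matrix below does not reproduce the exact GOE variance profile on the diagonal.

```python
import math
import random

def small_index_set(v, threshold):
    """Indices i with |v_i| <= threshold (threshold plays the role of phi^{-D})."""
    return {i for i, vi in enumerate(v) if abs(vi) <= threshold}

def symmetric_matrix(n, sampler):
    """Real symmetric n x n matrix with independent upper-triangular entries."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            m[i][j] = m[j][i] = sampler()
    return m

def hybrid_entries(w, g, v, threshold):
    """Entry (i, j) is Gaussian (from g) if both v_i and v_j are small, else taken from w."""
    small = small_index_set(v, threshold)
    n = len(v)
    return [[g[i][j] if (i in small and j in small) else w[i][j]
             for j in range(n)] for i in range(n)]

rng = random.Random(0)
raw = [9.0, 4.0, 1.0, 1.0, 0.5, 0.5]             # hypothetical vector
norm = math.sqrt(sum(x * x for x in raw))
v = [x / norm for x in raw]                       # normalized, as in the text
threshold = 0.2
n = len(v)
w = symmetric_matrix(n, lambda: rng.choice((-1.0, 1.0)))   # toy non-Gaussian entries
g = symmetric_matrix(n, lambda: rng.gauss(0.0, 1.0))       # toy Gaussian entries
w_tilde = hybrid_entries(w, g, v, threshold)
```

Because the number of large components of a normalized vector is small, only a thin band of rows and columns keeps its original, non-Gaussian entries.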
The following shorthand will prove useful. 

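Definition 7.11. Let $\{\sigma_N\}$ be a bounded positive sequence. If $A_N$ and $B_N$ are independent random variables with $B_N \overset{d}{\sim} \mathcal{N}(0, \sigma_N^2)$, and if $S_N \overset{d}{\sim} A_N + B_N$, then we write

$$S_N \;\overset{d}{\sim}\; A_N + \mathcal{N}(0, \sigma_N^2) .$$

For the following we write

$$X := \nu N^{1/2} \Big( G_{\mathbf{v}\mathbf{v}}(\theta) + \frac{1}{d} \Big) , \qquad \nu \equiv \nu_N := (d+1)(d-1)^{1/2} .$$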

Proposition 7.12. Fix $D > 0$. Let $\mathbf{v} \in \mathbb{C}^N$ be normalized and $H$ be a Wigner matrix such that if $|v_i| \le \varphi^{-D}$ and $|v_j| \le \varphi^{-D}$ then $h_{ij}$ is Gaussian. Then we have

where Q(v) and R(v) were defined in (2.22). 
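Proof. As before, we consistently drop the spectral parameter $z = \theta$ from our notation.

Let $M \in \mathbb{N}$ denote the number of entries of $\mathbf{v}$ satisfying $|v_i| > \varphi^{-D}$. Since $\mathbf{v}$ is normalized, we have $M \le \varphi^{2D}$. To simplify notation, we assume (after a suitable permutation of the rows and columns of $H$) that the entries of $\mathbf{v}$ satisfy $|v_i| > \varphi^{-D}$ for $i \le M$ and $|v_i| \le \varphi^{-D}$ for $i > M$. Split $\mathbf{v} = \binom{\mathbf{u}}{\mathbf{w}}$, where $\mathbf{u} \in \mathbb{C}^M$ and $\mathbf{w} \in \mathbb{C}^{N-M}$. (Throughout the following we assume that $\mathbf{w} \ne 0$; the case $\mathbf{w} = 0$ may be easily handled by approximation with nonzero $\mathbf{w}$.) We also split

$$H = \begin{pmatrix} A & B^* \\ B & H_0 \end{pmatrix} ,$$

where $A$ is an $M \times M$ matrix and $H_0$ an $(N - M) \times (N - M)$ matrix with Gaussian entries. Choose a deterministic orthogonal/unitary $(N - M) \times (N - M)$ matrix $S$ such that $S\mathbf{w} = (\|\mathbf{w}\|, 0, \dots, 0)^*$. Thus we get

$$G_{\mathbf{v}\mathbf{v}} = \mathbf{v}^* \begin{pmatrix} 1 & 0 \\ 0 & S^* \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & S \end{pmatrix} \begin{pmatrix} A - z & B^* \\ B & H_0 - z \end{pmatrix}^{-1} \begin{pmatrix} 1 & 0 \\ 0 & S^* \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & S \end{pmatrix} \mathbf{v} \;\overset{d}{=}\; \begin{pmatrix} \mathbf{u} \\ S\mathbf{w} \end{pmatrix}^* \begin{pmatrix} A - z & B^* S^* \\ SB & H_0 - z \end{pmatrix}^{-1} \begin{pmatrix} \mathbf{u} \\ S\mathbf{w} \end{pmatrix} ,$$

where we used that $S H_0 S^* \overset{d}{=} H_0$ and the fact that $A$, $B$, and $H_0$ are independent.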
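One concrete way to obtain a matrix $S$ with $S\mathbf{w} = (\|\mathbf{w}\|, 0, \dots, 0)^*$ is a Householder reflection. The sketch below is an illustration for real $\mathbf{w}$ only (the complex case needs an additional phase factor); the paper only requires that some such deterministic $S$ exists and does not prescribe this construction. The function names are hypothetical.

```python
import math

def householder_to_e1(w):
    """Orthogonal (and symmetric) matrix S with S w = (||w||, 0, ..., 0) for real w."""
    n = len(w)
    norm_w = math.sqrt(sum(x * x for x in w))
    u = list(w)
    u[0] -= norm_w                      # u = w - ||w|| e_1
    uu = sum(x * x for x in u)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    if uu < 1e-30:                      # w is already a nonnegative multiple of e_1
        return identity
    # S = 1 - 2 u u^T / (u^T u)
    return [[identity[i][j] - 2.0 * u[i] * u[j] / uu for j in range(n)]
            for i in range(n)]

def mat_vec(s, x):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(s[i][j] * x[j] for j in range(len(x))) for i in range(len(s))]

w = [0.3, -0.4, 1.2, 0.0]               # hypothetical vector with ||w|| = 1.3
S = householder_to_e1(w)
Sw = mat_vec(S, w)
```

The identity $S\mathbf{w} = \mathbf{w} - \mathbf{u}$ follows from $2\langle \mathbf{u}, \mathbf{w}\rangle = \langle \mathbf{u}, \mathbf{u}\rangle$ for $\mathbf{u} = \mathbf{w} - \|\mathbf{w}\| \mathbf{e}_1$.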
Next, we split 

s - (-f ») . n = (I I 

where $\mathbf{a} \in \mathbb{C}^{N-M-1}$ is a vector of i.i.d. Gaussians. Note that $S^*$ is an isometry, i.e. $SS^* = 1$. Thus we may
write






-1 








) 1 









=: r, (7.35) 

where the second equality defines the right-hand side using self-explanatory notation. Note that, by definition, $\|\mathbf{x}\| = \|\mathbf{v}\| = 1$.
Next, we claim that

$$(F^* F)_{ij} = \delta_{ij} + O\big(\varphi^{C_\zeta} N^{-1/2}\big) \tag{7.36}$$

with $\zeta$-high probability. In order to prove (7.36), write

(B*S*SB B*S*a\ 



F*F 



a* SB a* a 
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We consider four cases. First, if 1 ^ i ^ j < M we find using (3.15) that 

n / \ 1/2 



k,l 



AT 



TV 



with C-high probability. Second, if 1 < i < M we find using (3.13) and (3.14) that 



\(F*F) u -l\ 



'^2 B * k (S*S)kiB li - 1 

k,i 



+ 



(f c ( 



J2\(s*s) 

\ k,l 



\ 1/2 

2 < ^7V-V2 



with C-high probability. Third, for i = M + 1 we have by (3.13) 

\(F*F) u -l\ = |a*a-l| s? ^N' 1 ' 2 
with C-high probability. Finally, for 1 < i < j = M + 1 we have by (3.15) 

\ 1/2 



|(^)«| = 



k,l 



N 



v k,l 



ip C < /_ ~\ V2 

~N 



with C-high probability. This completes the proof of (7.36). 

Next, abbreviate $G_1(z) := (H_1 - z)^{-1}$. Since $N^{1/2} (N - M - 1)^{-1/2} H_1$ is an $(N - M - 1) \times (N - M - 1)$ GOE/GUE matrix, we find from (7.36), Theorem 2.3, and Lemma 3.2 that

$$\big| (F^* G_1 F)_{ij} - \delta_{ij} m \big| \le \varphi^{C_\zeta} N^{-1/2} (d - 1)^{-1/2} \tag{7.37}$$

with $\zeta$-high probability. Therefore Schur's formula yields



T = x*(-z-m- (-E + F*G 1 F-n 

= m||x|| 2 -m 2 (x,Sx) +m 2 ((Fx,GiFx) -m||x|| 2 ) + o(^p c <N- 1 {d - l)" 1 ) . (7.38) 

with C-high probability, where in the second step we expanded using (2.5), and estimated the error term 
using (7.37) as well as the bounds M < tp c < and \Eij\ ^ tp^N^ 1 / 2 . Recalling that ||x|| = 1, we find 



r — m = —m 



A 

w||/ Vw*B/||w|| 



B*w/||w| 



u 

Iwl 



-m 2 ||.Fx|! 



r (Fx,GiFx) -m 



+ m 3 (||Fx|| 2 - 1) + O^N-^d- l)" 1 ) (7.39) 



with C-high probability. 

Next, from Fx = SBu + llwlla we find 



||Fx|| 2 = (Bu,S*SBu) +2||w||Re(Bu,5*a) + ||w|| 2 ||a|| 2 

= (Bu,Bu) - |(w,Bu)| 2 + 2||w||Re(Bu,5*a) + ||w|| 2 ||a|| 2 . 
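Applying (3.12) to $\langle \mathbf{w}, B\mathbf{u} \rangle = \sum_{i,j} \bar w_i u_j B_{ij}$ (with $N$ in (3.12) replaced by $M(N - M)$), we find

$$|\langle \mathbf{w}, B\mathbf{u} \rangle|^2 \le \varphi^{C_\zeta} N^{-1} .$$

Similarly, using (3.13) and (3.14) we find that

$$\|B\mathbf{u}\|^2 = \|\mathbf{u}\|^2 + O\big(\varphi^{C_\zeta} N^{-1/2}\big) , \qquad \|SB\mathbf{u}\|^2 = \|\mathbf{u}\|^2 + O\big(\varphi^{C_\zeta} N^{-1/2}\big)$$

with $\zeta$-high probability, using (3.15) that

$$|\langle B\mathbf{u}, S^* \mathbf{a} \rangle| \le \varphi^{C_\zeta} N^{-1/2}$$

with $\zeta$-high probability, and using (3.13) that

$$\|\mathbf{a}\|^2 = 1 + O\big(\varphi^{C_\zeta} N^{-1/2}\big)$$

with $\zeta$-high probability. Using $\|\mathbf{u}\| > \varphi^{-D}$ (by definition of $\mathbf{u}$), we therefore conclude that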



||^x|| 2 = ||i?u|| 2 + 2Rc ™^ (SBu, a ) + ||w|| 2 ||a|| 2 + 0(y c < AT 1 ) = 1 + N^ 2 ) (7.40) 

l|5Bu||||a|| 

with C-high probability. Using Theorem 2.3 applied to G\ (recall that F and Hi are independent), we 
therefore get from (7.39) that 

r-m = -m 2 ((u,^u) + ||w|| 2 5 + 2Rc(w,Bu)) +m 2 ^p^( J F 1 x,Gi J F 1 x) -m^J 
+ - 3 (||Bu|| 2 - ||u|| 2 + 2 Re - *jL(SBu,a) + || w|| 2 (||a|| 2 - l)) +o{f<N~\d - l)" 1 ) (7.41) 
with C-high probability. We write this as 

r-m = r\ + • • • + r 6 + o^N-^d - 1)- 1 ) (7.42) 

with £-high probability, where 

ri := -m 2 (u,Au), T 2 := -m 2 ||w|| 2 3 , T 3 := m 2 ^pLp(Fx,G 1 Fx) - mj , 

T 4 := -2m 2 Rc(w,Bu) +to 3 (||Bu|| 2 - ||u|| 2 ) , r 5 := 2m 3 Re MJl^ (5Bu,a), 

||5Bu||||a|| 

T 6 := m 3 ||w|| 2 (||a|| 2 -l). 

We now claim that $\Gamma_1, \dots, \Gamma_6$ are independent. In order to prove this, let $f_1, \dots, f_6$ be indicator functions of Borel sets in $\mathbb{R}$. Write $\mathbf{a} = \alpha \boldsymbol{\omega}$ in polar coordinates, where $\alpha > 0$ and $\boldsymbol{\omega} \in S^{N-M-2}$. Since $\mathbf{a}$ is Gaussian, $\alpha$ and $\boldsymbol{\omega}$ are independent. Denote by $\rho_1, \dots, \rho_6$ the laws of $A$, $B$, $g$, $\alpha$, $\boldsymbol{\omega}$, $H_1$ respectively. Then we get

6 „ 6 

^HfiFi) = dp 1 (A)dp 2 (B)dps(d)dp i (a)dp 5 (u)dp 6 (H 1 )l[f i (T i ) 
i=i J »=i 

= (E/ 1 (r 1 ))(E/ 2 (r 2 ))(E/ 6 (r 6 )) J d P2 {B)dp 6 (u)d Pfi (H 1 )f 3 (r 3 )U(T 4 )f 6 (r B ) 
= (E/ 1 (r 1 ))(E/ 2 (r 2 ))(E/ 6 (r 6 ))(E/ 3 (r 3 ))(E/ 5 (r 5 )) J d P2 (s)/ 4 (r 4 ) 

6 

where the second equality follows by definition of the $\Gamma$'s, and the third from the invariance of the law of $\boldsymbol{\omega}$ under rotations (applied to $\Gamma_5$) and from the invariance of the law of $H_1$ under orthogonal/unitary conjugations (applied to $\Gamma_3$). This proves the independence of $\Gamma_1, \dots, \Gamma_6$.

Next, we identify the asymptotic laws of $\Gamma_1, \dots, \Gamma_6$. There is nothing to be done with $\Gamma_1$. By definition,

V N 1 I 2 Y 2 = A^O^/r 1 ™ 4 !^!! 4 ) . (7.43) 
Since Fx is independent of Hi and M < ip 2D , we get from Proposition 7.9 that 

uN^Ts £ A^0,m 4 ^±^). (7.44) 

In order to analyse L 4 , we define bi ■= (5u)j for i = 1, . . . , N — M. Then {bi}i are independent and satisfy 

E6i = 0, E|6 4 | 2 = l||u|| 2 , E|6,| 4 = ^||u|| 4 + ^ ^(mJ> - 4 + . 

3 

Thus we find 

T 4 = ^(-2m 2 Re«J^+m 3 (|6 l | 2 -E|6 4 | 2 )J +0(M/N). 

i 

The variance of the term in parentheses is 

E(^-2m 2 Rew J & i + m 3 (|fei| 2 -E|6,| 2 )) 2 

= 4m 4 E(RcwJ i 6 l ) 2 -4m 5 ERe((w l 6 i )|6 4 | 2 ) +m 6 E(|6 4 | 2 -E|6i| 2 ) 2 
= 4m 4 /3- 1 ^- 1 ||u|| 2 |^| 2 -4m 5 7V- 3 / 2 Rc('w^M( J 3) % -| % -| 2 ") 

^3 

+ ™ 6 N- 2 ((3 - /3)||u|| 4 + £(M#> - 4 + /?) | u ,| 4 ) . 

j 

Since |tOj| s; <p~"°, we get from the Central Limit Theorem and Lemma 7.10 that 

vN^Tt & AA^0,^ 2 ^||u|| 2 ||w|| 2 -4^ 2 m 5 O(w,u) + i , 2 m 6 (2r 1 ||u|| 4 + i?(u))^ , (7.45) 
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where we abbreviated 

Q(w,u) := AT-VSRe^MgV-Kf, R(u) := 1^-4 + ^) 



TV ■ 



and used that 3-/3 — 2(3 1 for /? = 1,2. Since ||u|| < 1 and ||w|| < 1, we find that Q(w,u) < C and 
R(u) < C for some positive constant C. Next, using F 5 = 2m 3 ||w|| Re(SBu,a) + O^^N^ 1 ) with C-high 
probability and 

E(2Rc(^u,a)) 2 = ^L(7V_ M -l)||u!| 2 , 

we find from the Central Limit Theorem and Lemma 7.10 that 

yN 1 ' 2 T b ~ Ar(0,4i/ 2 /9- 1 m 6 ||u|| 2 ||w|| 2 ) . (7.46) 
Finally, we have ||a|| 2 - 1 = ||a|| 2 - E||a|| 2 + 0(M/N) and 

E(H 2 -EH 2 ) 2 = 2(3- 1 N- 2 . 

Thus we conclude from the Central Limit Theorem and Lemma 7.10 that 

uN^ 2 T 6 ^ Af (0,2^13^ m 6 \M 4 ) . (7.47) 

Next, (7.43) - (7.47) imply that $\nu N^{1/2} \Gamma_2, \dots, \nu N^{1/2} \Gamma_6$ are tight (as $N$-dependent random variables). Moreover, an easy variance calculation shows that $\nu N^{1/2} \Gamma_1$ is also tight. Therefore we get from (7.35), (7.42), (7.43) - (7.47), Lemma 7.7, and Lemma 7.8 that (recall the notation from Definition 7.11)



X ~ -isN 1 / 2 m 2 (u,Au)+Af(0,V 1 ) : 



where 



2(d+l) 2v 2 ... ... .. ..o,, ll2 x 4j/ 2 

Vi •= ^ + ^(||wr + 2||u|| 2 ||w|| 2 ) + -^g(w,u) 



+ ^i?(u) + |^(||u|| 4 + 2||u|| 2 ||wj| 2 + ||w|| 4 ). 



Here we used (6.2). 
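Next, from

$$\langle \mathbf{v}, H\mathbf{v} \rangle = \langle \mathbf{u}, A\mathbf{u} \rangle + 2 \operatorname{Re} \langle \mathbf{w}, B\mathbf{u} \rangle + \langle \mathbf{w}, H_0 \mathbf{w} \rangle ,$$

the Central Limit Theorem, Lemma 7.10, and Lemma 7.8 we find

$$\nu N^{1/2} \langle \mathbf{v}, H\mathbf{v} \rangle \;\overset{d}{\sim}\; \nu N^{1/2} \langle \mathbf{u}, A\mathbf{u} \rangle + \mathcal{N}\Big(0, \frac{2\nu^2}{\beta} \big(1 - \|\mathbf{u}\|^4\big)\Big) . \tag{7.48}$$

Moreover, using that the dimension $M$ of $\mathbf{u}$ satisfies $M \le \varphi^{2D}$ and the fact that $\max_i |w_i| \le \varphi^{-D}$, we find

$$Q(\mathbf{w}, \mathbf{u}) = Q(\mathbf{v}) + O(\varphi^{-D}) , \qquad R(\mathbf{u}) = R(\mathbf{v}) + O(\varphi^{-2D}) .$$

Therefore we get, using Lemma 7.8 and recalling that $1 = \|\mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{w}\|^2$,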

x t -^- v , flv)+ ^„,^ + ^ (v) + ^ (v) + ^ 

This concludes the proof. □ 
7.4. Conclusion of the proof of Theorem 2.14. In this section we compute the distribution of $G_{\mathbf{v}\mathbf{v}}(\theta) - m(\theta)$ for a general Wigner matrix $H$, and hence complete the proof of Theorem 2.14. We use the Green function comparison method from the proof of Lemma 3.9.

Let $H = (h_{ij}) = (N^{-1/2} W_{ij})$ be an arbitrary real symmetric / Hermitian Wigner matrix, $V = (N^{-1/2} V_{ij})$ a GOE/GUE matrix independent of $H$, and $\mathbf{v} \in \mathbb{C}^N$ be normalized. For $D > 0$ define the subset

$$I_D := \big\{ i = 1, \dots, N : |v_i| \le \varphi^{-D} \big\} .$$

Define a new Wigner matrix $\tilde H = (\tilde h_{ij}) = (N^{-1/2} \tilde W_{ij})$ through

$$\tilde W_{ij} := \begin{cases} V_{ij} & \text{if } i \in I_D \text{ and } j \in I_D \\ W_{ij} & \text{otherwise} . \end{cases}$$

Thus, $\tilde H$ satisfies the assumptions of Proposition 7.12. Let

$$J_D := \big\{ 1 \le i \le j \le N : i \in I_D \text{ and } j \in I_D \big\}$$

be the set of matrix indices to be replaced. Similarly to (3.21), we choose a bijective map $\phi : J_D \to \{1, \dots, \gamma_{\max}(D)\}$ and denote by $H_\gamma = (h^\gamma_{ij})$ the matrix defined by

$$h^\gamma_{ij} := \begin{cases} N^{-1/2} W_{ij} & \text{if } \phi(i, j) \le \gamma \\ N^{-1/2} \tilde W_{ij} & \text{otherwise} . \end{cases}$$

In particular, $H_0 = \tilde H$ and $H_{\gamma_{\max}(D)} = H$. Let now $(a, b) \in J_D$ satisfy $\phi(a, b) = \gamma$. Similarly to (3.22), we write

$$H_{\gamma - 1} = Q + N^{-1/2} V \quad \text{where} \quad V := V_{ab} E^{(ab)} + \mathbf{1}(a \ne b) V_{ba} E^{(ba)} ,$$

and

$$H_\gamma = Q + N^{-1/2} W \quad \text{where} \quad W := W_{ab} E^{(ab)} + \mathbf{1}(a \ne b) W_{ba} E^{(ba)} .$$

In order to avoid singular behaviour on exceptional low-probability events, we add a small imaginary part to the spectral parameter $\theta$, and set $z := \theta + iN^{-4}$. Abbreviate

$$x := \nu N^{1/2} \operatorname{Re} \big( G_{\mathbf{v}\mathbf{v}}(z) - m(z) \big) . \tag{7.49}$$

Thus we have the rough bound |x| $5 N which we shall tacitly use in the following. We use the notation 
(3.23), which gives rise to the quantities xr,xs,xt defined through (7.49) with G replaced by R,S,T 
respectively. We may now state the main comparison estimate. 

Lemma 7.13. Provided D is a large enough constant, the following holds. Let f £ C 3 (R) be bounded with 
bounded derivatives and q = q^ be an arbitrary deterministic real sequence. Then 

Ef(x T + q) = Ef(x R + q)+ Y ab Ef'(x R + q) + A ab + O^^ab) , (7.50) 
Ef(x s + q) = Ef(x R + g)+ A ab + O^iat,) , (7.51) 

where A ab satisfies \A ab \ ^ ip^ 1 , 

Y ab := -vN^Re^M^VaVb + rr^M^VbVa) , 
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and 

CT,T = cr=0 

Before proving Lemma 7.13, we show how it implies Theorem 2.14. 

Proof of Theorem 2.14. Fix $D > 0$ large enough that the conclusion of Lemma 7.13 holds. By Remark 7.5, we may assume that $f \in C_c^\infty(\mathbb{R})$. Let $\gamma = \phi(a, b)$. Since $|v_a| \le \varphi^{-D}$ and $|v_b| \le \varphi^{-D}$, we find

Y 2 b < p" 1 ^. (7.52) 

Applying (7.50) and (7.51) with / replaced by /' yields 

Y ab Ef(x T + q) = Y ab Ef(x R + q)+Y ab A ab + 0(p- 1 £ ab ). 

Subtracting this from (7.50) and using \A ab \ < (p^ 1 yields 

Ef(x T + q) = Ef(x R + q)+ Y ab Ef(x T + q) + A ab + 0{ip~ l £ ab + ^Y^) . 

Subtracting (7.51) yields 

E/(* 7 + q) = E/(a: 7 _i + q) + Y ab E.f'(x 7 + q) + 0(<p-% b + tp' 1 ^) , 

where we introduced the notation x 1 ■■= vN 1 / 2 Re((i? 7 — z)~* — m(z)). Using (7.52) we therefore get 

E/(.t 7 + q - Y ab ) = E/(x 7 _! + q) + 0(<p-% b + ^ 1 |r ab |) . (7.53) 

We now iterate (7.53), starting at 7 = 1 and q = 0. Using that J2 a b^ ab ^ ^ an( ^ S a &l^af>l ^ we ^ n< ^ 
after 7 max iterations of (7.53) 

/ 7max(D) \ 

E/^ 7max( D)- VM7)J = E/(x )+O(^- 1 ). 

Moreover, using |u a | < tp~ D and \v b \ < i/? -15 , we find that 

7max(D) 



2 V'(7) = -^'Re Ma^b)m(zf(Mf b ] v a v b + M^v h v a ) 

a.belo 

N 



7=1 a.beI D 

N 



a,b=l 

Using Lemma 7.8 we find 

N 

vN x /*({H - z)-l - m(zj) & vN^ 2 {{H - z)^ - m(zj) - vN' 1 Re ]T m(z) 4 M^>v a v b . 

0,6=1 
Using Lemma 7.2, it is now easy to remove the imaginary part $N^{-4}$ of $z$ to get
Since $\tilde H$ satisfies the assumptions of Proposition 7.12, we find

using the notation of Definition 7.11. Now Theorem 2.14 follows from Proposition 7.1 and Lemma 7.7. □ 

Proof of Lemma 7.13. As before, we consistently drop the spectral parameter $z = \theta + iN^{-4}$ from $G$ and $m$. We focus on (7.50). From Theorem 2.3, (3.29), and (3.28) (with $S$ replaced by $T$), we find

|T va | < ip^N-^d-l)- 1 ' 2 + C\v a \, \R va \ < tpCtN-Wid-iyW + Clval+tp^N- 1 ' 2 ^ (7.54) 

with C-high probability, and similar results hold for T av , T vb , T bv , R av , R vb , and R bv . Similarly, from the 
first inequality of (3.30) (with S replaced by T), we get 

|T VV - i? vv | sc ^N-^fN-^d - l)- 1 + N-^id - l)- 1/2 (\v a \ + \v b \) + k|k| + N-^(\v a \ 2 + \v b \ 2 )) 
with C-high probability. This yields 

\x T -x R \ < p 5 < [N-\d - l)- 1 ' 2 + N-V 2 (\v a \ + \v b \) + (d - l) 1/2 \v a \\v b \ + N-^ 2 (d - l) 1/2 (k| 2 + \v b 

(7.55) 

with C-high probability for some constant Cf. Now choose D > + 1. By definition of Jn, we have that 
\v a \ ^ <f~ D and \v b \ < <f~ D - Therefore 

\XT ~ Xr\ 3 < ¥> _1 £a6 

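with $\zeta$-high probability. This yields

$$\mathbb{E} f(x_T + q) = \mathbb{E} f(x_R + q) + \mathbb{E} \big( f'(x_R + q)(x_T - x_R) \big) + \frac{1}{2} \mathbb{E} \big( f''(x_R + q)(x_T - x_R)^2 \big) + O(\varphi^{-1} \mathcal{E}_{ab}) . \tag{7.56}$$

In order to analyse $x_T - x_R = \nu N^{1/2} \operatorname{Re}(T_{\mathbf{v}\mathbf{v}} - R_{\mathbf{v}\mathbf{v}})$, we write

$$x_T - x_R = y_1 + y_2 + y_3 + y_4 ,$$

where

$$y_k := \begin{cases} \nu N^{1/2 - k/2} \operatorname{Re} \big( (-RW)^k R \big)_{\mathbf{v}\mathbf{v}} & \text{if } k = 1, 2, 3 \\ \nu N^{-3/2} \operatorname{Re} \big( (-RW)^4 T \big)_{\mathbf{v}\mathbf{v}} & \text{if } k = 4 . \end{cases}$$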
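The decomposition into $y_1, \dots, y_4$ rests on the exact resolvent identity $T = \sum_{k=0}^{3} (-N^{-1/2} RW)^k R + (-N^{-1/2} RW)^4 T$, where $R = (Q - z)^{-1}$ and $T = (Q + N^{-1/2} W - z)^{-1}$. The following sketch checks this identity on a small toy example; the matrices, the spectral parameter, and all helper names are hypothetical, and a plain Gauss-Jordan inverse is used only because the matrices are tiny.

```python
def mat_mul(a, b):
    """Product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(c, a):
    return [[c * x for x in row] for row in a]

def mat_inv(a):
    """Gauss-Jordan inverse with partial pivoting (adequate for tiny matrices)."""
    n = len(a)
    m = [[complex(x) for x in row] + [complex(i == j) for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [x / piv for x in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [row[n:] for row in m]

def resolvent(h, z):
    """(h - z)^{-1} for a square matrix h."""
    n = len(h)
    return mat_inv([[h[i][j] - (z if i == j else 0) for j in range(n)]
                    for i in range(n)])

def expansion(r, t, w, eps, order=4):
    """sum_{k < order} (-eps R W)^k R + (-eps R W)^order T."""
    n = len(r)
    x = mat_scale(-eps, mat_mul(r, w))
    power = [[complex(i == j) for j in range(n)] for i in range(n)]
    total = [[0j] * n for _ in range(n)]
    for _ in range(order):
        total = mat_add(total, mat_mul(power, r))
        power = mat_mul(power, x)
    return mat_add(total, mat_mul(power, t))

# Hypothetical toy data: a 3 x 3 symmetric Q and a perturbation on the (1, 2) and (2, 1) entries.
n_dim = 3
q = [[0.2, -0.5, 0.1], [-0.5, 0.0, 0.3], [0.1, 0.3, -0.4]]
w = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
eps = n_dim ** -0.5                      # plays the role of N^{-1/2}
z = 0.7 + 0.5j                           # nonzero imaginary part keeps the resolvents bounded
r = resolvent(q, z)
t = resolvent(mat_add(q, mat_scale(eps, w)), z)
err = max(abs(expansion(r, t, w, eps)[i][j] - t[i][j])
          for i in range(n_dim) for j in range(n_dim))
```

The identity is exact at every order because it is obtained by iterating $T = R - N^{-1/2} RWT$; the analytic work in the proof lies in bounding the individual terms.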

Using (7.54), it is easy to check that y\ is bounded by the right-hand side of (7.55), and that 



\Uk\ 



< v c ^N 1 ' 2 - k ' 2 (N- l {d-l)- 1 / 2 + {d-l) 1 ' 2 {\v a \ 2 + \v b \ 2 )) (fc- 2,3,4) (7.57) 
with $\zeta$-high probability. In particular,

$$x_T - x_R = y_1 + y_2 + y_3 + O(\varphi^{-1} \mathcal{E}_{ab})$$

with $\zeta$-high probability. Moreover, using $|v_a| \le \varphi^{-D}$, $|v_b| \le \varphi^{-D}$, (7.57) for $k = 2$, and the fact that $y_1$ is bounded by the right-hand side of (7.55), we find that

$$|y_1| |y_2| \le \varphi^{-1} \mathcal{E}_{ab}$$

with $\zeta$-high probability, provided $D$ is chosen large enough. Similarly, using (7.57) we find that $|y_k| |y_{k'}| \le \varphi^{-1} \mathcal{E}_{ab}$ for $k, k' \ge 2$ for large enough $D$. Thus we conclude from (7.56) that

$$\mathbb{E} f(x_T + q) = \mathbb{E} f(x_R + q) + \mathbb{E} \big( f'(x_R + q) y_3 \big) + A_{ab} + O(\varphi^{-1} \mathcal{E}_{ab}) ,$$

where

$$A_{ab} := \mathbb{E} \big( f'(x_R + q)(y_1 + y_2) \big) + \frac{1}{2} \mathbb{E} \big( f''(x_R + q) y_1^2 \big)$$

depends on the randomness only through $R$ and the first two moments of $W_{ab}$. Moreover, from (7.57) and the fact that $y_1$ is bounded by the right-hand side of (7.55), we conclude that $|A_{ab}| \le \varphi^{-1}$.

What remains is the analysis of the term $\mathbb{E} \big( f'(x_R + q) y_3 \big)$. We shall prove that

$$\big| \mathbb{E} \big( f'(x_R + q) y_3 \big) - Y_{ab} \, \mathbb{E} f'(x_R + q) \big| \le C \varphi^{-1} \mathcal{E}_{ab} . \tag{7.58}$$

If $a = b$, it is easy to see from (7.57) and the definition of $Y_{ab}$ that

$$|y_3| + |Y_{ab}| \le \varphi^{-1} \mathcal{E}_{ab} ,$$

from which (7.58) follows.

Let us therefore assume that $a \ne b$. We multiply out the matrix product in $\big( (-RW)^3 R \big)_{\mathbf{v}\mathbf{v}}$ and regroup the resulting eight terms according to the number, $r$, of off-diagonal matrix elements ($R_{ab}$ or $R_{ba}$) of $R$. (By convention, the endpoint matrix elements $R_{\mathbf{v} \cdot}$ and $R_{\cdot \mathbf{v}}$ are not counted as off-diagonal.) This gives, in self-explanatory notation, $y_3 = \sum_{r=0}^{2} y_{3,r}$. Using Theorem 2.3 and (7.54), we find



|2/3,i| + |2/3, 2 | < /^-^(^-^(d-ir^ + M + H) sC tp-'S, 

with C-high probability . Therefore it suffices to prove that 

E(f(x R + q)y 3fi )-Y ab Ef(x R + q)\ < C^ 1 ^ 

for a^b. By definition, 

V3,o = -vN^ 1 Re^R va W ab R bb W ba R aa W ab R bv + R vb W ba R aa W ab R bb W ba R, 
Using (7.54) and Theorem 2.3 we find 

2/3,0 + vN- 1 Rc(m 2 \W ab \ 2 (R va W ab R bv + R vb W ba R av )} < ^Zab 



ab 



(7.59) 
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with C-high probability. We only deal with the first term of 2/3,0; the second one is dealt with analogously. 
Recalling the definition of Y ab , we conclude that, in order to establish (7.59), it suffices to prove 



C(f £ ab 

with C-high probability; here we used that R is independent of W ab . 

Setting u = (wj) with Ui ■= l(i ^ {a, b})vi and recalling (3.9) and (3.10), we get 

Rva = VaRaa+VbRba+ Rua = v a m + v a (R aa - m) + v b R ba + mTZ ua + (R aa - m)K v 

where we defined 



(7.60) 



(7.61) 



(a) 



n — - V R {a) h 



(b) 



see (4.20). The second and third terms are estimated using (7.54) and Theorem 2.3: 

\R aa -m\ + \R ba \ sC ip^N-^id-l)- 1 / 2 



(7.62) 



with C-high probability. Moreover, since 

R (a) = T (o) j we find from Lcmma (3.12), Theorem 2.3, and (3.8) 

that 



/ (o) \ 1/2 



^ ^^^(h^ + N-Hd-iy^y < ^JV-V^-l)-V2 (7 . 63) 



with C-high probability. A similar estimate holds for lZ bu . Using (7.61), (7.62), (7.63), and (7.54) we get 



E 



{f'&R + q)(R va R bv - m 2 v a v b )^j 



< vN- 1 



E 



f'(xn + q) (m 2 v b 1l ua + m 2 v a K bu + m 2 TZ ua TZ bu ^ + dp 1 £ ab (7.64) 



with C-high probability. 

What remains is to estimate the right-hand side of (7.64). Defining 



(a) 



:= vN 1 ' 2 Re(R<$ 



m) 



we find from (3.8) and (7.54) that 

\x R - 4 a) I sC <p c < (N- 1 ' 2 ^ - I)- 1 ' 2 + NV\d - l) 1/2 \v a \ 2 + N-V 2 (d - l) 1/2 \v b \ 2 ) 

with C-high probability. Using (7.63) and using that the derivative of / is bounded, we may estimate the 
first term of (7.64) as 



E 



f'(x R + q)v b K 



E 



f \x { ^ ] + q)v b K ua +C(p 1 £ ab = Op 1 £ ab 
with $\zeta$-high probability. In the second step we used that $x_R^{(a)}$ is independent of the $a$-th column of $Q$ and that $\mathbb{E}_a \mathcal{R}_{\mathbf{u}a} = 0$. The second term of (7.64) is similar. In order to estimate the third, we have to make $\mathcal{R}_{b\mathbf{u}}$ independent of the $a$-th column of $Q$. (See the definition (3.22).) We estimate, using (3.8), $R^{(b)} = T^{(b)}$, (3.12), and (7.54)



(ab) 



^ \h ba T^\ + 



(ab) 

^2 h bi 



T (b) T (b) 

T {b) 
1 aa 



< ^{N-\d-l)-^ + N-^\v a \)+^N-^(^ 

< y c i (JV-^d - I)" 1 + N-^ 2 (d - l)- 1/2 \v a \) 



1/2 



with £-high probability. Thus we may estimate the third term of (7.64) by 



vN- 1 



E 



f(x R + q)TZ ua TZ bu 



^ vN- 1 



E 



(ab) 



f(x^+q)n ua J2hbiR^ 



(ab) 



+ Ctp 1 £ a b = Clfi l £ a b- 



0. This concludes the proof of (7.60), and hence of 



where in the second step we again used that E a 7£ u 
(7.50). 

The proof of (7.51) is almost identical to the proof of (7.50), except that $\mathbb{E} |V_{ab}|^2 V_{ab} = 0$, so that the left-hand side of the analogue of (7.60) vanishes. Note that, by definition, $A_{ab}$ depends only on $R$ and on the first two moments of $W_{ab}$, which coincide with those of $V_{ab}$. Hence $A_{ab}$ is the same in (7.50) and (7.51). This concludes the proof. □